2020-03-26: Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects https://arxiv.org/abs/2003.12045v1
We demonstrate that jointly optimizing for both contact point prediction and keypoint projection error improves the results on both tasks in comparison to training models in isolation.
When we humans look at a video of human-object interaction, we can not only
infer what is happening but also extract actionable information and
imitate those interactions. On the other hand, current recognition or geometric
approaches lack the physicality of action representation. In this paper, we
take a step towards a more physical understanding of actions. We address the
problem of inferring contact points and the physical forces from videos of
humans interacting with objects. One of the main challenges in tackling this
problem is obtaining ground-truth labels for forces. We sidestep this problem
by instead using a physics simulator for supervision. Specifically, we use a
simulator to predict effects and enforce that estimated forces must lead to the
same effect as depicted in the video. Our quantitative and qualitative results
show that (a) we can predict meaningful forces from videos whose effects lead
to accurate imitation of the motions observed, (b) by jointly optimizing for
contact point and force prediction, we can improve the performance on both
tasks in comparison to independent training, and (c) we can learn a
representation from this model that generalizes to novel objects using few-shot learning.
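To make the simulator-in-the-loop supervision concrete, here is a minimal sketch of the joint objective in PyTorch. The two-head model, the `simulate` stand-in for a differentiable physics step, and all dimensions are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ForceContactNet(nn.Module):
    """Hypothetical two-head model: one head regresses contact points,
    the other regresses force vectors, from per-frame visual features."""
    def __init__(self, feat_dim=512, n_points=5):
        super().__init__()
        self.contact_head = nn.Linear(feat_dim, n_points * 3)  # 3D contact points
        self.force_head = nn.Linear(feat_dim, n_points * 3)    # 3D force vectors

    def forward(self, feats):
        b = feats.shape[0]
        contacts = self.contact_head(feats).view(b, -1, 3)
        forces = self.force_head(feats).view(b, -1, 3)
        return contacts, forces

def joint_loss(model, feats, gt_contacts, observed_keypoints, simulate, alpha=1.0):
    """Joint objective: contact-point error plus the keypoint projection
    error of the effect the predicted forces produce in a simulator.
    `simulate` is a stand-in for a differentiable physics step mapping
    (contacts, forces) to projected object keypoints."""
    contacts, forces = model(feats)
    simulated = simulate(contacts, forces)
    contact_loss = nn.functional.mse_loss(contacts, gt_contacts)
    effect_loss = nn.functional.mse_loss(simulated, observed_keypoints)
    return contact_loss + alpha * effect_loss  # backprop through the simulator
```

The key point is that no ground-truth force labels appear anywhere: the force head is supervised only through the effect its predictions have inside the simulator.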
In essence, Zachary's argument is that the basic idea was stolen and had already been published before AllenAI even started the project.
The counterargument by Matt is that their paper focuses on NLP and looks at the problem from a geometric point of view, as opposed to the "causality" viewpoint taken in the ICLR paper.
The allegations made by Zachary are quite harsh. Imagine you are in a situation where either (a) someone accuses you of plagiarizing their paper, or (b) someone publishes a paper with the same idea as one of your already published papers. How would you react?
Disclaimer: I am not taking any sides in this argument.
2020-04-02: Semantic Segmentation of Underwater Imagery: Dataset and Benchmark https://arxiv.org/abs/2004.01241v1
In this paper, we attempt to address these limitations by presenting the first large-scale annotated dataset for general-purpose semantic segmentation of underwater scenes.
In this paper, we present the first large-scale dataset for semantic
Segmentation of Underwater IMagery (SUIM). It contains over 1500 images with
pixel annotations for eight object categories: fish (vertebrates), reefs
(invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and
sea-floor. The images are rigorously collected during oceanic explorations and
human-robot collaborative experiments, and annotated by human participants. We
also present a comprehensive benchmark evaluation of several state-of-the-art
semantic segmentation approaches based on standard performance metrics.
Additionally, we present SUIM-Net, a fully-convolutional deep residual model
that balances the trade-off between performance and computational efficiency.
It offers competitive performance while ensuring fast end-to-end inference,
which is essential for its use in the autonomy pipeline by visually-guided
underwater robots. In particular, we demonstrate its usability benefits for
visual servoing, saliency prediction, and detailed scene understanding. With a
variety of use cases, the proposed model and benchmark dataset open up
promising opportunities for future research on underwater robot vision.
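For readers who want to reproduce the benchmark side, the standard region metric for such evaluations is per-class intersection-over-union. A minimal NumPy sketch (the label maps here are random toy data; the paper's actual evaluation protocol may also include other metrics such as F-score):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-category intersection-over-union between two integer label
    maps of the same shape; averaging the valid entries gives mIoU."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

# Toy example with eight categories, as in SUIM (random label maps):
pred = np.random.randint(0, 8, size=(64, 64))
gt = np.random.randint(0, 8, size=(64, 64))
print(per_class_iou(pred, gt, num_classes=8))
```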
2020-04-02: Learning to See Through Obstructions https://arxiv.org/abs/2004.01180v1
Integrating optical flow estimation and coarse-to-fine refinement enables our model to robustly recover the underlying clean image from challenging real-world sequences.
We present a learning-based approach for removing unwanted obstructions, such
as window reflections, fence occlusions or raindrops, from a short sequence of
images captured by a moving camera. Our method leverages the motion differences
between the background and the obstructing elements to recover both layers.
Specifically, we alternate between estimating dense optical flow fields of the
two layers and reconstructing each layer from the flow-warped images via a deep
convolutional neural network. The learning-based layer reconstruction allows us
to accommodate potential errors in the flow estimation and brittle assumptions
such as brightness consistency. We show that training on synthetically
generated data transfers well to real images. Our results on numerous
challenging scenarios of reflection and fence removal demonstrate the
effectiveness of the proposed method.
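A minimal sketch of the alternating scheme described above, assuming hypothetical `flow_net` and `recon_net` interfaces that stand in for the paper's flow-estimation and layer-reconstruction networks (the actual coarse-to-fine details differ):

```python
import torch

def decompose_sequence(frames, flow_net, recon_net, n_iters=3):
    """Alternating decomposition of a (T, C, H, W) frame stack into a
    background layer and an obstruction layer. `flow_net` and `recon_net`
    are assumed interfaces, not the authors' released code."""
    background = frames.mean(dim=0, keepdim=True)     # crude initialization
    obstruction = torch.zeros_like(background)
    for _ in range(n_iters):
        # 1) estimate dense optical flow fields for the two layers
        bg_flow, obs_flow = flow_net(frames, background, obstruction)
        # 2) reconstruct each layer from the flow-warped frames with a CNN,
        #    which tolerates flow errors and brightness-consistency violations
        background, obstruction = recon_net(frames, bg_flow, obs_flow)
    return background, obstruction
```

The division of labor is what makes the idea work: the flow step exploits the motion difference between the two layers, while the learned reconstruction step absorbs the inevitable warping errors.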