[R] Surprisal-Triggered Conditional Computation with Neural Networks[reddit]/r/MachineLearning
https://arxiv.org/abs/2006.01659
We have shown that it is possible to use the surprisal of an autoregressive model to decide whether a big neural network or a small neural network processes each input in a stream, using the big network only for the more difficult inputs and thereby reducing overall computation. In one instance, this reduced the FLOP count by 15% at no cost in accuracy.
Abstract: Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network, or a big, slow network, is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used with 15% fewer FLOPs.
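The routing idea in the abstract can be illustrated with a minimal sketch. It assumes the autoregressive model has already assigned a probability to each observation, and it assumes a simple fixed threshold on the surprisal; the abstract only says that surprisal "determines" which network is used, so the threshold rule, the function names, and the FLOP costs below are all hypothetical.

```python
import math


def surprisal(probability):
    """Negative log-likelihood (in nats) of an observation."""
    return -math.log(probability)


def route(probabilities, threshold):
    """Pick a network per observation: 'big' if surprisal exceeds the
    threshold, otherwise 'small'. The threshold rule is an assumption."""
    return ["big" if surprisal(p) > threshold else "small"
            for p in probabilities]


def average_flops(routes, small_flops, big_flops):
    """Mean per-input cost for a given routing (costs are hypothetical)."""
    costs = [big_flops if r == "big" else small_flops for r in routes]
    return sum(costs) / len(costs)


# Probabilities the autoregressive model gave to four observations.
probs = [0.9, 0.05, 0.6, 0.01]
routes = route(probs, threshold=1.0)
print(routes)
print(average_flops(routes, small_flops=1.0, big_flops=10.0))
```

Raising the threshold sends more inputs to the small network and lowers the average cost; the accuracy side of that trade-off is what the paper's experiments measure.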
Impact of Student Debt or General Debt on Mental Health[reddit]/r/datasets
Hi, I am looking for a dataset to identify the connection / relation between debt and mental health disorders. I know that these datasets are sensitive and very difficult to find, but I was wondering if anyone has ever come across anything anonymized related to this topic.
How do you learn labels without labels? How do you classify images when you don't know what to classify them into? This paper investigates a new combination of representation learning, clustering, and self-labeling in order to group visually similar images together - and achieves surprisingly high accuracy on benchmark datasets.
Historical Real Estate Listings Data[reddit]/r/datasets
Hi, I'm looking to get real estate listing data for a personal project but I've run into dead ends in every way I've tried. I've looked into Zillow, RedFin, and other places with their APIs and available data sets, but they either limit requests to a laughable size, charge an astronomical price, only provide averages instead of individual listings, or don't have historical data. I know this MLS data is usually highly protected, but I figured it would be worth a shot to ask if anyone had ideas on any dumps or releases that had been done.
I'm looking to build a machine learning model for price forecasting, so I'd like the dataset to include geolocation, square feet, color, quality, type etc. as shown on a listing site.
If anyone could provide some insight on how I could go about obtaining this I'd very much appreciate it.
Extensive football matches and betting datasets[reddit]/r/datasets
Hi there all!
We have just started working on the idea of a fraud detection system for football matches. However, we are stuck at the stage of finding structured data sources on football matches from several leagues (major or minor ones), along with extensive betting statistics about them.
Are there any such datasets available online?
Also, are there any extensive datasets including all/most of the known fixed matches, at least in major scandals (e.g. Calciopoli), along with betting statistics etc.?
[D] On the "Broader Impact" statement for NeurIPS[reddit]/r/MachineLearning
I hope many of you are busy finishing up your NeurIPS publication. Good luck to all.
As I craft my Broader Impact statement, one thing in particular worries me. Certainly the originators of this proposal are quite pleased that it is included. I do not deny that technology, machine learning, and AI have a broader impact, and it's useful to consider this broader impact.
I am, however, puzzled as to whether a conference publication is the appropriate venue for this discussion. I worry that our work will be judged by this broader impact, and the question is whether that judgement will be fair and impartial. What is considered positive or negative is a cultural, subjective matter. How is it fair or appropriate to judge the merits of a work by a foreign country's researcher from the cultural viewpoint of your own? I worry that this will be an excuse to impose certain Western viewpoints of correctness, morality, and fairness on the world's researchers, who may not share these viewpoints. I worry that this will allow certain political disagreements between countries to be acted upon by rejecting papers that do not conform to the Western viewpoint of correctness, morality, or fairness.
Do you think this is likely? Do you think this is fair, or ok?
[D] Have the past few months provided evidence that distributed, remote research labs are possible now?[reddit]/r/MachineLearning
I'm an early-career researcher in a fairly large research group, and over the past few years I've always been involved in remote collaborations of some sort or other. But recently, with COVID and the other issues happening, I've noticed that a lot more potential collaborators are open to working remotely on projects. I reach out to well-known researchers at universities and even in industry, and it anecdotally seems like more people are down to work on projects with distributed teams. Is anyone else seeing this too?
I'm sort of thinking about starting a formal collaboration that is entirely remote and ongoing. I've been discussing this with lots of my colleagues and friends across research and engineering, and it feels like there is so much low-hanging fruit in bridging the gap between research and application in different ML topics like Meta-Learning.
Do you think there is an opportunity for a distributed research organization to build practical, useful ML tools? I know many researchers in a cluster of related research areas who are interested, but I'm not sure what the best path forward is. A flexible, online research collaboration seems reasonable.
I am looking for syslog data from Cisco devices including firewalls, routers, switches, and Intrusion Prevention/Detection Systems (IPDS) that we can run through the tool that we are developing.
The demonstration will simulate data being monitored and ingested through the tool's data flow. The tool will read the syslog data, determine whether each entry is a threat (based on set rules), and then either store the file in an archive with no further action required, generate a report for the analyst, or send an immediate alert.
If anyone knows where I can find this kind of sample, thank you in advance.
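The three-way triage described in the syslog request above (archive, report, or alert) could be prototyped along these lines. This is only a sketch: it assumes messages carry the usual Cisco-style `%FACILITY-SEVERITY-MNEMONIC` tag with severity 0 (most severe) to 7, and the severity cut-offs, the `triage` function, and the fallback for unparsed lines are hypothetical stand-ins for whatever rules the actual tool uses.

```python
import re

# Cisco-style tag, e.g. %ASA-4-106023 or %LINK-3-UPDOWN.
# Tags with an extra sub-facility field are not handled by this sketch.
TAG = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)")


def triage(line, alert_max=2, report_max=4):
    """Return 'alert', 'report', or 'archive' for one syslog line.

    The thresholds are illustrative assumptions: severities 0-2 alert,
    3-4 go to the analyst, 5-7 are archived.
    """
    match = TAG.search(line)
    if match is None:
        # Assumption: anything we cannot parse is shown to an analyst.
        return "report"
    severity = int(match.group(2))
    if severity <= alert_max:
        return "alert"
    if severity <= report_max:
        return "report"
    return "archive"


# Illustrative lines, not taken from a real capture.
samples = [
    "Jun 1 12:00:01 fw1 %ASA-1-105005: Lost failover communications",
    "Jun 1 12:00:02 fw1 %ASA-4-106023: Deny tcp src outside:10.0.0.1/1234",
    "Jun 1 12:00:03 sw1 %SYS-5-CONFIG_I: Configured from console",
]
for line in samples:
    print(triage(line), "<-", line)
```

Keeping the rules in one function with explicit thresholds makes it easy to swap in real rule sets once sample data is available.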
Reinforcement Learning (RL) provides an elegant formalization for the problem of intelligence.
In combination with advances in deep learning and increases in computation, this formalization has resulted in powerful solutions to longstanding artificial intelligence challenges — e.g. playing Go at a championship level. We believe it also offers an avenue for solving some of our greatest challenges: from drug design to industrial and space robotics, or improving energy efficiency in a variety of applications.
However, in this pursuit, the scale and complexity of RL programs have grown dramatically over time. This has made it increasingly difficult for researchers to rapidly prototype ideas, and has caused serious reproducibility issues. To address this, we are launching Acme, a tool to increase reproducibility in RL and make it simpler for researchers to develop novel and creative algorithms.
Acme is a framework for building readable, efficient, research-oriented RL algorithms. At its core Acme is designed to enable simple descriptions of RL agents that can be run at various scales of execution — including distributed agents. By releasing Acme, our aim is to make the results of various RL algorithms developed in academia and industrial labs easier to reproduce and extend for the machine learning community at large.