#TFCommunitySpotlight Winner: Firiuza Shigapova. Using TensorFlow 2.x, Firiuza built a library of Graph Neural Networks containing GraphSage and GAT models for node and graph classification problems. Great work, Firiuza! Firiuza's GitHub [goo.gle][twitter]

[D] Andrew Ng: Three challenges in AI deployments & the solutions[reddit]/r/MachineLearning

Many companies and research teams struggle to turn research into practical production deployments. Andrew Ng shared some interesting points in a recent Stanford talk on bridging AI's proof-of-concept-to-production gap.

Challenges in AI deployments:

Small data is common in industrial applications outside the consumer internet, while AI research often works with big data. [video section]

Algorithms’ robustness and generalizability. A model that works in a published paper often does not work in production. [video section]

Change management. A model that automates one piece of a workflow can potentially impact the whole system and many stakeholders. [video section]

The solution:

Systematically plan out the full cycle of machine learning projects, from scoping to data, modeling, and deployment. [video section]

[R] Scaling Laws for Autoregressive Generative Modeling (OpenAI)[reddit]/r/MachineLearning https://arxiv.org/abs/2010.14701

• We identify the optimal model size N_opt(C) for a given compute budget, and find that it can be accurately modeled as a pure power law, N_opt ∝ C^β, with a power β ∼ 0.7 for all modalities, as shown in the figure.

Abstract: We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image↔text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-law plus constant scaling law. The optimal model size also depends on the compute budget through a power-law, with exponents that are nearly universal across all data domains. The cross-entropy loss has an information theoretic interpretation as S(True) + D_KL(True‖Model), and the empirical scaling laws suggest a prediction for both the true data distribution’s entropy and the KL divergence between the true and model distributions. With this interpretation, billion-parameter Transformers are nearly perfect models of the YFCC100M image distribution downsampled to an 8 × 8 resolution, and we can forecast the model size needed to achieve any given reducible loss (i.e., D_KL) in nats/image for other resolutions. We find a number of additional scaling laws in specific domains: (a) we identify a scaling relation for the mutual information between captions and images in multimodal models, and show how to answer the question “Is a picture worth a thousand words?”; (b) in the case of mathematical problem solving, we identify scaling laws for model performance when extrapolating beyond the training distribution; (c) we finetune generative image models for ImageNet classification and find smooth scaling of the classification loss and error rate, even as the generative loss levels off. Taken together, these results strengthen the case that scaling laws have important implications for neural network performance, including on downstream tasks.
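The compute-optimal model size relation quoted above, N_opt ∝ C^β with β ≈ 0.7, can be sketched in a few lines. This is only an illustration of the power-law form: the proportionality constant `k` below is a made-up placeholder, not a value from the paper.

```python
# Sketch of the paper's compute-optimal model size relation:
# N_opt(C) = k * C^beta, with beta ~ 0.7 reported across modalities.
# k is a hypothetical constant chosen for illustration only.

def optimal_model_size(compute: float, k: float = 1.0, beta: float = 0.7) -> float:
    """Power-law estimate of the optimal parameter count for a compute budget."""
    return k * compute ** beta

# A useful consequence of the power law: doubling compute multiplies the
# optimal model size by 2^0.7, about 1.62x, regardless of the constant k.
ratio = optimal_model_size(2.0) / optimal_model_size(1.0)
print(round(ratio, 2))  # 1.62
```

Because the exponent is close to 0.7 for all four domains studied, the same doubling rule applies whether the budget goes to images, video, multimodal, or math models.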

Constraints can spur creativity and incite action, as long as you have the confidence to embrace them. An amazing portfolio of 1000 programs, each fitting in a tweet of code and running on an 8-bit virtual computer: [bbcmicrobot.com][twitter]

[R] References on the generalization theory of neural networks[reddit]/r/MachineLearning

Can someone point me to the main papers on the generalization theory of neural networks?

I want to study this research subject from a mathematical perspective, but I can’t find the 5-10 most relevant papers.

The literature is enormous, and it’s hard to sort through for someone who is not working on the subject. I thought I would first ask the machine learning researchers here.

I am not looking for experimental papers. My background is a math PhD, and I don’t understand simulations.

CSV file containing the Wikidata id, title, lat/lng coordinates, and short description for all Wikipedia articles with location data (updated). [self-promotion][reddit]/r/datasets

I shared this a couple of months ago, but recently discovered a bug that was leaving a number of locations out. The current version has 1,175,381 coordinates. I've also added short descriptions for articles (where possible).
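A file with this shape (Wikidata id, title, lat/lng, short description) is easy to work with using only the standard library. A minimal parsing sketch follows; the column names (`wikidata_id`, `title`, `lat`, `lng`, `short_description`) are assumptions about the file's header, not confirmed by the post, so adjust them to the actual CSV.

```python
# Minimal sketch of parsing the dataset described above.
# The header names here are assumed, not taken from the actual file.
import csv
import io

# Stand-in for open("wikipedia_locations.csv") with one example row.
sample = io.StringIO(
    "wikidata_id,title,lat,lng,short_description\n"
    "Q64,Berlin,52.52,13.405,Capital of Germany\n"
)

rows = []
for row in csv.DictReader(sample):
    rows.append({
        "id": row["wikidata_id"],
        "title": row["title"],
        # Coordinates arrive as strings; convert to floats for geo queries.
        "coord": (float(row["lat"]), float(row["lng"])),
        "desc": row["short_description"],
    })

print(rows[0]["title"], rows[0]["coord"])
```

With ~1.2M rows, `csv.DictReader` streams fine, but loading into pandas or SQLite is worth it if you plan to filter by bounding box repeatedly.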