
Unity ML-Agents Toolkit v0.4 and Udacity Deep Reinforcement Learning Nanodegree

June 19, 2018

We are happy to announce the release of the latest version of ML-Agents Toolkit: v0.4. It contains a number of new features that we hope everyone will enjoy.

This release includes the option to train your environments directly from the editor, rather than as built executables, which makes iteration much quicker. In addition, we are introducing a set of challenging new environments, as well as algorithmic improvements that help agents learn to solve tasks that previously could be learned only with great difficulty, or in some cases not at all. You can try out the new release by going to our GitHub release page. In more exciting news, we are partnering with Udacity to launch an online education program: the Deep Reinforcement Learning Nanodegree. Read on to learn more.


We include two new environments with our latest release: Walker and Pyramids. Walker is a physics-based humanoid ragdoll, and Pyramids is a complex sparse-reward environment.


The first new example environment we are including is called “Walker.” It contains agents which are humanoid ragdolls. They are completely physics-based, so the goal is for the agent to learn to control its limbs in a way that allows it to walk forward. It learns this… with somewhat humorous results. Since there are many degrees of freedom in the agent’s body, we think this can serve as a great benchmark for the Reinforcement Learning algorithms that researchers might develop.


The second new environment is called “Pyramids.” It features the return of our favorite blue cube agent. Rather than collecting bananas or hopping over walls, this time around the agent has to get to a golden brick atop a pyramid of other bricks. The trick, however, is that this pyramid only appears once a randomly placed switch has been activated. The agent only gets a positive reward upon reaching the brick, making this a very sparse-reward environment.

Additional environment variations

Additionally, we are providing visual observation and imitation learning versions of many of our existing environments. The visual observation environments, in particular, are designed as a challenge for researchers interested in benchmarking models that use convolutional neural networks (CNNs).

To learn more about our provided example environments, follow this link.

Improved learning with Curiosity

To help agents solve tasks in which rewards are few and far between, we’ve added an optional augmentation to our PPO algorithm. That augmentation is an implementation of the Intrinsic Curiosity Module, as described in this research paper from last year. In essence, the addition allows the agent to reward itself using an intrinsic reward signal based on how surprised it is by the outcome of its actions. This enables it to solve very sparse-reward environments, such as the Pyramids environment described above, more easily and more often.
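To make the idea of a "surprise" reward concrete, here is a minimal sketch in plain Python. It is not the ML-Agents implementation: the function names, the `curiosity_strength` parameter, and the example feature vectors are all hypothetical and chosen for illustration. It assumes that a learned forward model has already predicted the features of the next state, and it treats the prediction error as the intrinsic reward, which is the general approach of the Intrinsic Curiosity Module.

```python
from typing import Sequence


def forward_model_error(predicted_next: Sequence[float],
                        actual_next: Sequence[float]) -> float:
    """Half the squared distance between predicted and observed next-state features.

    A large value means the agent was "surprised" by what its action led to.
    """
    if len(predicted_next) != len(actual_next):
        raise ValueError("feature vectors must have the same length")
    return 0.5 * sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


def total_reward(extrinsic: float,
                 predicted_next: Sequence[float],
                 actual_next: Sequence[float],
                 curiosity_strength: float = 0.01) -> float:
    """Combine the environment's reward with a scaled curiosity bonus.

    `curiosity_strength` is a hypothetical scaling factor for this sketch.
    """
    intrinsic = curiosity_strength * forward_model_error(predicted_next, actual_next)
    return extrinsic + intrinsic


# In a sparse-reward setting the extrinsic reward is usually zero, so the
# curiosity bonus is the only learning signal until the goal is first reached.
surprising_step = total_reward(0.0, [0.0, 0.0], [3.0, 4.0], curiosity_strength=0.1)
familiar_step = total_reward(0.0, [1.0, 2.0], [1.0, 2.0], curiosity_strength=0.1)
```

In this sketch, a step whose outcome the forward model predicted perfectly earns no bonus, while a poorly predicted step earns a positive one. That nudges the agent toward parts of the environment it does not yet understand, such as a switch it has never pressed.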

In-Editor training

One feature that has been requested since the announcement of ML-Agents toolkit is the ability to perform training from within the Unity Editor. We are happy to be taking the first step toward that goal in this release. It is now possible to simply launch the training script, and then press the “play” button from within the editor to perform training. Training can now happen without having to build an executable, which allows for faster iteration. We think this will save our users a lot of time, as well as narrow the gap between traditional game development workflows and the ML-Agents training process. This is made possible by a revamping of our communication system. Our improvements to the developer workflow will not stop here, though. This is just the first step toward even closer integration with the Unity Editor, which will be rolling out throughout 2018.

TensorFlowSharp upgrade

Lastly, we are happy to share that the TensorFlowSharp plugin has now been upgraded from 1.4 to 1.7.1. This means that developers and researchers can now use Unity ML-Agents Toolkit with models built using the near-latest version of TensorFlow and maintain compatibility between the models they train and the models they can embed into Unity projects. We have also improved our documentation around creating Android and iOS executables which take advantage of ML-Agents toolkit. You can check it out here.

Udacity Deep Reinforcement Learning Nanodegree

We are proud to announce that we are partnering with Udacity on a new nanodegree to help students and our community of users who want a deeper understanding of reinforcement learning. This Udacity course uses ML-Agents toolkit to illustrate and teach the various concepts. If you’ve been using ML-Agents toolkit or want to know the math, algorithms, and theories behind reinforcement learning, sign up.


In addition to the features described above, we’ve also improved the performance of PPO, fixed a number of bugs, and improved the quality of tests provided with the ML-Agents codebase. As always, we welcome any feedback which you might have. Feel free to reach out to us on our GitHub issues page, or email us directly at

15 replies on “Unity ML-Agents Toolkit v0.4 and Udacity Deep Reinforcement Learning Nanodegree”

I like that you mentioned something about Android. Have you, or anyone tested ML agents on Android? Would really like to know if it’s possible, and also the performance on mobile.

It’s amazing how fast this is unfolding. Shame the Udacity course costs so much. I won’t be able to afford it until I get a job out of college in a few years. Then again with student loans it might be 6 years till I can afford it.

A Udacity course sounds like a great idea. But if the aim of Unity ML is democratizing machine learning in Unity, then I think it would have been good to have started with a more entry level program. Looks super interesting, but it looks a bit beyond where my knowledge is at the moment.

Well, it seems that the course is not available in China (that’s why 404 shows up). So try it with a VPN

I just really hope this is all going to feed into anything to do with helping us make games..

Great news! ML-Agents workflow indeed needed some improvements in order to be more accessible for casual Unity hobbyists, can’t wait to check it out.

Great progress! I will test it out. Note: disappointed that there is no free track in the Udacity courses. I like the model where you can audit the courses for free without getting an official statement of completion and you can upgrade later for a price to get the certificate if you feel like it.
