Monday, October 28, 2019

What Is Reinforcement Learning?

Put simply, reinforcement learning is a machine learning technique in which an artificial intelligence agent is trained through repeated actions and the rewards associated with them. A reinforcement learning agent experiments in an environment, taking actions and being rewarded when the correct actions are taken. Over time, the agent learns to take the actions that will maximize its reward. That’s a quick definition of reinforcement learning, but a closer look at the concepts behind it will give you a better, more intuitive understanding.

Reinforcement In Psychology

The term “reinforcement learning” is adapted from the concept of reinforcement in psychology. For that reason, let’s take a moment to understand the psychological concept of reinforcement. In the psychological sense, the term reinforcement refers to something that increases the likelihood that a particular response/action will occur. This concept of reinforcement is a central idea of the theory of operant conditioning, initially proposed by the psychologist B.F. Skinner. In this context, reinforcement is anything that causes the frequency of a given behavior to increase. Possible reinforcers for humans include things like praise, a raise at work, candy, and fun activities.
In the traditional, psychological sense, there are two types of reinforcement. There’s positive reinforcement and negative reinforcement. Positive reinforcement is the addition of something to increase a behavior, like giving your dog a treat when it is well behaved. Negative reinforcement involves removing a stimulus to elicit a behavior, like shutting off loud noises to coax out a skittish cat.

Positive and Negative Reinforcement In Machine Learning

Both positive and negative reinforcement increase the frequency of a behavior; what distinguishes them is whether a stimulus is added or removed. In general, positive reinforcement is the most common type of reinforcement used in reinforcement learning, as it helps models maximize their performance on a given task. Not only that, but positive reinforcement leads the model to make more sustainable changes, changes which can become consistent patterns and persist for long periods of time.
In contrast, while negative reinforcement also makes a behavior more likely to occur, it is used for maintaining a minimum performance standard rather than pushing a model toward its maximum performance. Negative reinforcement in reinforcement learning can help keep a model away from undesirable actions, but it can’t really drive the model to explore desired actions.

Training A Reinforcement Agent

When a reinforcement learning agent is trained, four ingredients are used in the training: an initial state (State 0), a new state (State 1), actions, and rewards.
Imagine that we are training a reinforcement agent to play a platforming video game where the AI’s goal is to make it to the end of the level by moving right across the screen. The initial state of the game is drawn from the environment, meaning the first frame of the game is analyzed and given to the model. Based on this information, the model must decide on an action.
During the initial phases of training these actions are random, but as the model is reinforced, certain actions become more common. After the action is taken, the environment of the game is updated and a new state, or frame, is created. If the action taken by the agent produced a desirable result (say the agent is still alive and hasn’t been hit by an enemy), some reward is given to the agent, and it becomes more likely to take the same action in the future.
This basic loop repeats again and again, and each time through it the agent learns a little more about how to maximize its reward.
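To make this loop concrete, here is a minimal Python sketch written against a Gym-style reset/step interface. The env and agent objects, and the agent's act() and learn() methods, are assumed placeholders rather than any particular library's API.

    # A minimal sketch of the training loop described above, assuming a
    # Gym-style environment and a hypothetical agent with act() and learn().
    def train(env, agent, num_episodes=1000):
        for episode in range(num_episodes):
            state = env.reset()                    # State 0: the first frame
            done = False
            while not done:
                action = agent.act(state)          # random early on, learned later
                next_state, reward, done, _ = env.step(action)  # State 1 plus reward
                agent.learn(state, action, reward, next_state, done)
                state = next_state                 # the new state becomes the current one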

Episodic vs Continuous Tasks

Reinforcement learning tasks can typically be placed in one of two categories: episodic tasks and continuous tasks.
Episodic tasks carry out the learning/training loop and improve performance until some end criterion is met, at which point the episode terminates. In a game, this might be reaching the end of the level or falling into a hazard like spikes. In contrast, continuous tasks have no termination criteria; they essentially continue training forever, until the engineer chooses to end the training.

Monte Carlo vs Temporal Difference

There are two primary ways of learning, or training, a reinforcement learning agent. In the Monte Carlo approach, rewards are delivered to the agent (its score is updated) only at the end of the training episode. To put that another way, only when the termination condition is hit does the model learn how well it performed. It can then use this information to update its estimates, and when the next training round starts it will respond in accordance with the new information.
The temporal-difference method differs from the Monte Carlo method in that the value estimate, or score estimate, is updated during the course of the training episode: as soon as the model advances to the next time step, the values are updated.
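The two update rules can be sketched side by side. In the snippet below, V maps states to value estimates (for example a defaultdict(float)); the step size alpha and discount gamma are assumed hyperparameters.

    # Monte Carlo: wait until the episode ends, then update each visited
    # state toward the full observed return G.
    def monte_carlo_update(V, episode, alpha=0.1, gamma=0.99):
        """episode is a list of (state, reward) pairs from one finished episode."""
        G = 0.0
        for state, reward in reversed(episode):
            G = reward + gamma * G              # return from this state onward
            V[state] += alpha * (G - V[state])  # update only after the episode ends

    # Temporal difference (TD(0)): update at every time step, bootstrapping
    # from the current estimate of the next state's value.
    def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.99):
        td_target = reward + gamma * V[next_state]
        V[state] += alpha * (td_target - V[state])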

Explore vs Exploit

Training a reinforcement learning agent is a balancing act between two competing behaviors: exploration and exploitation.
Exploration is the act of collecting more information about the surrounding environment, while exploitation is using the information already known about the environment to earn reward points. If an agent only explores and never exploits the environment, the desired actions will never be carried out. On the other hand, if the agent only exploits and never explores, it will only learn to carry out one action and won’t discover other possible strategies for earning rewards. Therefore, balancing exploration and exploitation is critical when creating a reinforcement learning agent.
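A common way to strike this balance is epsilon-greedy action selection: with probability epsilon the agent picks a random action (explore), and otherwise it picks the action with the highest estimated value (exploit). Decaying epsilon over training shifts the agent from exploring toward exploiting. A minimal sketch:

    import random

    def epsilon_greedy(q_values, epsilon):
        """q_values: list of estimated action values, one entry per action."""
        if random.random() < epsilon:
            return random.randrange(len(q_values))  # explore: random action
        return max(range(len(q_values)), key=q_values.__getitem__)  # exploit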

Uses For Reinforcement Learning

Reinforcement learning can be used in a wide variety of roles, and it is best suited to applications where tasks involve automated, sequential decision-making.
Automation of tasks to be carried out by industrial robots is one area where reinforcement learning proves useful. Reinforcement learning can also be used for problems like text mining, creating models that are able to summarize long bodies of text. Researchers are also experimenting with using reinforcement learning in the healthcare field, with reinforcement agents handling jobs like the optimization of treatment policies. Reinforcement learning could also be used to customize educational material for students.

Concluding Thoughts

Reinforcement learning is a powerful method of constructing AI agents that can lead to impressive and sometimes surprising results. Training an agent through reinforcement learning can be complex and difficult, as it takes many training iterations and a delicate balance of the explore/exploit dichotomy. However, if successful, an agent created with reinforcement learning can carry out complex tasks in a wide variety of environments.

Failure is good – how ‘black box thinking’ will change the way we learn about AI

All paths to success lead through failure; you have to change your perspective on it. Let’s apply the ‘logic of failure’ to artificial intelligence


By Anita Constantine, Constellation AI, 22 October 2019
In a brave new world, it’s not just the brave who must stand accountable to prevent the repetition of historical errors. We live in a morass of political cover-ups, data breaches and capitalist control: now, more than ever, is the time for radical transparency. A change to regulatory mindset must happen, so we’re applying Matthew Syed’s theory of ‘Black Box thinking and logic of failure’ to Artificial Intelligence.
Too often in a social or business hierarchy, we feel unable to challenge enduring practices and behaviours. We abide by the rules and regulations we might know to be outdated and inefficient; we might witness dangerous error or negligence, and yet feel unable to challenge figures of authority. Negative loops perpetuate when people do not investigate error, especially when they suspect they may have made a mistake themselves. But, when insight can prevent future error, why withhold it? The only way to learn from failure is to change our perspective of it, to understand that it isn’t necessarily a bad thing.
In aviation, if something — or someone — fails on board the aircraft, the indestructible Black Box will detect and record it. Independent review bodies have been established to monitor findings, with the sole purpose of discovery. The process is almost impossible to cover up. Rather than chase culpability, the findings are noted and shared throughout the industry; everyone has access to the data, so that everyone can implement their learnings. Once pilots were protected, they came forward to discuss error and failure became a necessary step in learning: it altered industry regulation. In aviation, failure is ‘data-rich’. Countless lives have been saved as a result.
In medicine, there are numerous reasons why a human or system might fail during surgery or patient care. In the past, mistakes have been silenced for fear of recrimination, and vital opportunities to learn were discarded. Last year, the NHS spent £2.6 billion in litigation for medical errors and negligence, funds that could have been far better placed elsewhere. Mistakes aren’t a waste of valuable resources; they are a way of safeguarding them. Speaking up about current failures can help us to avoid catastrophic failures in the future. In order to create a transparent environment in which we can progress from error, we need to move from a blame culture to a learning culture: to study the environment and systems in which mistakes happen, to understand what went wrong, and to divulge the lessons learned.
In ‘Black Box Thinking — The Surprising Truth About Success (and Why Some People Never Learn from Mistakes)’, Matthew Syed calls for a new future of transparency and a change to the mindset of failure. These principles, according to Syed, are about “the willingness and tenacity to investigate the lessons that often exist when we fail, but which we rarely exploit. It is about creating systems and cultures that enable organisations to learn from errors, rather than being threatened by them.” By changing your relationship with failure to a positive one, you’ll learn to stop avoiding it.
“ALL PATHS TO SUCCESS LEAD THROUGH FAILURE AND WHAT YOU CAN DO TO CHANGE YOUR PERSPECTIVE ON IT. ADMIT YOUR MISTAKES AND BUILD YOUR OWN BLACK BOX TO CONSISTENTLY LEARN AND IMPROVE FROM THE FEEDBACK FAILURE GIVES YOU”.
– Matthew Syed, ’Black Box Thinking’

The AI black box

Let’s apply this ‘logic of failure’ to Artificial Intelligence as an alternative approach to regulation, with transparency and learning based on ‘Black Box thinking’ and aviation.
Contrary to hard-line AI ethicists — who may have a fatalistic view on punishment when things go wrong — a ‘Black Box thinking’ approach allows us to be realistic in how we react to and deal with issues: how we work to solve them, and how we translate that to the rest of the industry so that others might learn too.
In any industry, applying intelligent systems to the challenges we face is likely to result in unintended consequences. It’s not always obvious how to identify hazards, or even to ask the right questions, and there is always the chance that something will go wrong; therefore, no business should be placed on a pedestal. We need to collect data, spot meaningful patterns, and learn from them, taking into account not only the information we can see, but the information we can’t. Using ‘deliberate practice’, we can consistently measure margins of error, readjusting them each time. This can be applied to every part of human learning. How can we progress and innovate if we cannot learn? How can we learn if we can’t admit to our mistakes?
What we can do is respond with transparency, accountability and proactivity to those consequences: to be trusted to do the right thing, to challenge industry standards and to consistently work on improving them. We must not build an industry on silence, in fear of being vilified. Instead of extreme punishment, we should create the space and processes to learn and share knowledge, using root-cause analysis, so that issues are not repeated elsewhere. We need to gather the best perspectives and opinions, with experts coalescing to challenge and debate industry standards. By doing this, AI will advance more effectively and safely, and society will reap the rewards.

Artificial Intelligence is not a new industry, it is a new age. To live successfully in this brave new world, we must readjust our thinking and be just that: brave. In good hands, technology and artificial intelligence can turbo-charge the power of learning. We’ll get to a better place, faster, if we can hold people accountable and resolve issues in public. We must have the courage to face the future with openness and honesty. To not be afraid of failure, and to admit to it, for the sake of learning.

Three Realities of Artificial Intelligence as We Approach 2020

Deep Learning technology has enabled a democratization of Artificial Intelligence: it used to be that you needed a team of people to describe AI's features, and it was a long process, with another team of qualified PhDs required to deploy algorithms.

But nowadays, we have Ph.D. students doing internships where they produce really valuable and viable production-ready results. There are also resources such as TensorFlow, an open-source Machine Learning library that anyone can play around with, as well as hundreds of AI-focused online courses and summer schools. The democratization of Artificial Intelligence has truly been a revolution, and something we should be proud of.
However, even though we’re not far from solving many of the world’s problems with Artificial Intelligence, if we’re not looking carefully at how Artificial Intelligence is being deployed, we may become complacent and miss out on breakthroughs and opportunities to achieve far greater things.
So how should we be thinking of AI today? Here are three observations that may help companies considering the use of Artificial Intelligence or those who might be questioning its evolution over the past few years.

We Shouldn’t Be So Scared of Artificial Intelligence

A lot of people fear the consequences of AI making its own decisions, but the reality is that humans are likely to always have more influence than it may appear at this point in time.
I’m a big believer in the hybrid human-AI model, and I think even when we do create ‘super intelligence’, there is going to be a human component embedded there. We’re going to see a merge between human and AI brains. We’re already offloading a lot of our brains into machines on a daily basis: how many of us remember phone numbers anymore?
The question is more: if we’ve got a lot of computational power, what do we direct that computational power towards? These decisions, as well as what we want to solve and create, will likely always have some form of human interaction. We’ll definitely be seeing a combination of human and machine resources when it comes to applying computational power and outcomes.

We’re Not Going to Create a God-Like Algorithm Anytime Soon

What does it mean to create algorithms that are unbiased? The problem is, we can’t eliminate biases completely. So many of our biases and decisions come from many different factors: many are inherent, moral opinions about how things should happen and how society should behave.
As humans, we struggle to get past our own biases, so we may need to accept what a challenge it will be to get rid of machine-learned bias. The best that we can hope to do now is make the best decisions we can as humans, and accept progress over perfection. […]

How will artificial intelligence affect your business?



Artificial intelligence is the most exciting upcoming technology in the world today. The advent of autonomous vehicles and the prospect of fully intelligent machines fills us with excitement and dread. As with any technology, there are those who are completely bent on imbuing their businesses with the newest advancements as soon as they can.
In this article, we hope to cover applications your business can use today that can make your life easier and the lives of your employees better. There is no global scheme too great for artificial intelligence.
This host of applications is purely meant to create a more efficient and productive work environment, one that can provide the catalyst for a new age of technological advancement in which your business can thrive.

1. Speech recognition

Speech recognition software is one of the most groundbreaking and widespread technologies to come out of the artificial intelligence industry. Just think of the personal assistants like Google Home and Amazon Alexa.
This software can improve your business life in a myriad of ways, not least through its scheduling and task management abilities. Speech recognition is now used in place of typing, as it is up to three times faster than traditional typing with only a 4% margin of error.
This creates a much more efficient flow of work at much higher accuracy than ever before. It also allows employees to work faster without tiring themselves out, since frequent typing over long periods can physically strain the hands.
In fact, with the use of speech recognition AI, a company could all but eliminate carpal tunnel syndrome among its employees. There is also the added benefit of employees keeping their schedules up to date automatically, without the need for innumerable sticky notes or secretaries.
Speech recognition software can improve efficiency dramatically and provide a much more productive environment for your employees, where work can be a little less grueling and a little more satisfying.

2. Image recognition

Visual recognition software has been in use for a number of years and, until recently, was fairly flawed. Thanks to advances in neural networks and other processing systems, visual recognition has become much more accurate and can pick out even intricate details in an image. Its chief use in the workplace is security: with image recognition, you can verify who enters your building or who uses which computer.
This could prevent fraud and theft, along with a whole host of other problems that arise when people gain access to things illicitly with no real security checks. Image recognition is an invaluable resource for business owners who want to keep their employees safe and their work secure, so that no malevolent actor can disrupt the business’s everyday functions.

3. Malware detection

Detection capabilities are incredibly important in the development of artificial intelligence, as they promise to reduce the amount of damage caused by cyber breaches. A few cybersecurity companies, like Deep Instinct, use artificial intelligence to automatically detect malware and remove it from your system or block it from entering your company’s intranet. Such systems can also detect problems like overheating in your server room, helping prevent meltdowns.
Detection software can also recognize if a user is accessing your network illegitimately and block them. Artificial intelligence is, in fact, a dream for cybersecurity, as it is the only thing fast enough to keep up with the rate of cyber breaches and hacking.
There is nothing more important than keeping your files safe and out of the hands of malicious people. Artificial intelligence applications like malware detection can make sure that your information is kept private and secure.

Conclusion

For many businesses, it can take a while to integrate a new technology. Though excitement abounds, many owners don’t know how, or necessarily want, to bring AI into their workspaces. Soon, however, there really will not be a choice, as technology is advancing at a rate faster than ever before.
Therefore, having artificial intelligence and the necessary apps integrated into your business will not be something one can opt out of. In the meantime, it is important to stay educated and understand what can, and cannot, be integrated into your business now. Luckily, artificial intelligence comes with a whole host of applications that can make your business run better and more smoothly than ever before.
Artificial intelligence is an inevitable technology that is coming into force and will someday reach every industry. The goal of AI is not to replace people but to disrupt old ways of working and create an easier life for employees and business owners. Whether through streamlined work or safer spaces, artificial intelligence applications are there to make your life easier and your work better.
The applications discussed above can be implemented immediately and can transform how your company operates. It is better to be ahead of technology than constantly reacting to it.

Monday, October 21, 2019

Artificial intelligence and farmer knowledge boost smallholder maize yields


Farmers in Colombia's maize-growing region of Córdoba had seen it all: too much rain one year, a searing drought the next. Yields were down and their livelihoods hung in the balance.

The situation called for a new approach. They needed information services that would help them decide what varieties to plant, when they should sow and how they should manage their crops. A consortium was formed with the government, Colombia's National Cereals and Legumes Federation (FENALCE), and big-data scientists at the International Center for Tropical Agriculture (CIAT). The researchers used big-data tools on the data farmers helped collect, and yields increased substantially.
The study, published in September in Global Food Security, shows how machine learning of data from multiple sources can help make farming more efficient and productive even as the climate changes.
"Today we can collect massive amounts of data, but you can't just bulk it, process it in a machine and make a decision," said Daniel Jimenez, a data scientist at CIAT and the study's lead author.
"With institutions, experts and farmers working together, we overcame difficulties and reached our goals."
During the four-year study, Jimenez and colleagues analyzed the data and developed guidelines for increased production. Some farmers immediately followed the guidelines, while others waited until they were verified in the field. Farmers who adopted the full suite of machine-generated guidelines saw their yields increase from an average of 3.5 tons per hectare to more than 6 tons per hectare. This is an excellent yield for rainfed maize in the region.
The guidelines also substantially reduced fertilizer costs, and provided advice on how to reduce risks related to variation in the weather, with an emphasis on reducing the negative impacts of heavy rainfall.
Researchers from FENALCE co-authored the study, which is part of a Colombian government program aimed at providing farmers with options to manage both weather variability and climate change.
"If one farmer provides data to a researcher, it is almost impossible to gain many insights into how to improve management," said James Cock, a co-author and emeritus CIAT scientist. "On the other hand, if many farmers, each with distinct experiences, growing conditions, and management practices, provide information, with the help of machine learning it is possible to deduce where and when specific management practices will work."
Year-on-year, maize yields in the study region vary by as much as 39 percent due to the weather. Small farmers in the past had to rely on their own knowledge of their crops and accept blanket recommendations often developed by researchers far removed from their own milieu. The study shows that by combining farmers' knowledge with data on weather, soils and crop response to variables, farmers can, at least partially, shield their crops against climate variability and stabilize their yields at a higher level.
Maize harvested at a CIAT research site in Colombia. Credit: CIAT / Neil Palmer
From farm to algorithm
In Córdoba, FENALCE, which compiles information on maize plantations, harvests, yields and costs, set up a web-based platform to collect and maintain data from individual farms. Local experts uploaded information on soils after visiting farms at various stages of the crop development, while IDEAM, Colombia's weather agency, supplied weather information from six stations in the region. This allowed researchers to match daily weather station information with individual fields and the various stages of the growing season.
The researchers used machine learning algorithms and expert analysis to measure the impact of different weather, soil conditions and farming practices on yields. For example, they noticed that improving soil drainage to reduce run-off likely reduces yields when rainfall is lower, whereas doing the same in areas with a lot of rain boosts yields. This shows advice on crops needs to be site-specific.
The study demonstrated that the amount of phosphorus applied, the seed rate, and field run-off capacity had a major impact on yield levels. Understanding the effects of the inputs on the crops allowed experts to guide small farmers towards the best practices to use in order to produce high, stable yields.
The upshot for farmers is that most of the management practices the study recommends do not require major investments, showing that food security and livelihoods can be improved—at least in this case—without major expenditures.
Human learning, too
Initially, CIAT and FENALCE designed a smartphone application for farmers to record soil and other data in the field but corn growers did not adopt the app. Although the web-based platform was used to compile the information, researchers and technical assistants had to visit the farms to help the farmers collect the data. This presents challenges for scaling up this type of exercise.
Nevertheless, researchers see opportunities for increased data collection by smallholders, both by directly working with farmers and through technology. Future projects could incorporate apps already developed and used by farmers. Furthermore, data collection by a whole array of technologies, ranging from satellites, drones and low-cost sensors deployed in fields to combine harvesters that accurately record grain yield at a micro-scale, is becoming a reality in the developing world.
"Much of the hardware and software for the future collection of data may well come when the private sector becomes involved in developing sustainable systems for capturing, analyzing and distributing information," said Jimenez. "In the future we can envisage every field being carefully characterized and monitored, turning the landscape into a whole series of experiments that provide data which machine learning can interpret to help famers manage their crops better."
Copyright: https://phys.org

African Women in Tech Look to Artificial Intelligence

Artificial intelligence took center stage as African female technology experts met at Women in Tech Week in Ghana to promote women’s involvement in the field. 
When Lily Edinam Botsyoe was studying computer science at a university in Ghana, students wrote programming codes on a whiteboard because there were not enough computers. This made it difficult to apply the coding skills they were learning, she says, and the problem continues today.
From Voice of America. Story by Stacey Knott.
“We have students coming out of schools having the theoretical background — which is very important because you can’t actually appreciate something practical if you don’t have the theory. But, the industry-ready skills is lacking because they didn’t have the hands-on experience,” Botsyoe said.
She wants to see more resources for students, especially for girls and women, to get practical experience in technology in Ghana and across Africa.
 
Today, Botsyoe is a system tester and works to mentor other women in coding and artificial intelligence.    
Botsyoe presented at the Women in Tech Week in Accra, along with her colleague, data scientist Aseda Addai-Deseh.
They explained to participants what artificial intelligence is, how it works and, most importantly, how it can be used and developed by African women. Such uses include helping a community overcome a lack of health professionals, or increasing agricultural yields with automated farming.
For Addai-Deseh, the potential for artificial intelligence in Africa is boundless.
“Africa is the next market because there are so many problems here to be solved, and when you have so many problems, you have so many opportunities,” she said.
Addai-Deseh says that while more Africans are taking notice of AI, the majority of the industry is in North America, Europe and Asia, and is largely male.
She wants to see more investment in artificial intelligence developers across the continent — especially in women.
Real estate agent Maya Yiadom was watching the two women’s presentation.
While excited about artificial intelligence’s potential, she is also concerned about technology replacing jobs in Africa, where many nations already suffer from high unemployment.

“Work as we know it is going to change and I’m not sure, millions, possibly billions of us already, how are we going to survive?” Yiadom said.
Read more at Voice of America.

University of Artificial Intelligence launched in Abu Dhabi

Virendra Saklani/Gulf News

Dr Sultan Ahmad Al Jaber with Professor Sir Michael Brady, Dr Kai Fu Lee, Professor Anil K. Jain, Professor Andrew Chi Chin Yao, Professor Daniela Rus and Peng Xiao addressing media at the launch of the Mohammad Bin Zayed University of Artificial Intelligence in Abu Dhabi. Image Credit: Virendra Saklani/Gulf News
Abu Dhabi: Taking another bold step in the world of artificial intelligence (AI), Abu Dhabi on Wednesday announced the opening of the world’s first dedicated AI university – Mohammad Bin Zayed University of Artificial Intelligence (MBZUAI).
Located in Masdar City with the latest state-of-the-art facilities and equipment, the university will offer both masters (two years) and PhD programmes (four years) for local and international graduate students across three main specialised fields – machine learning, computer vision and natural language processing – as the UAE looks to equip the next generation of students with the latest expertise in the field of AI.
Official applications for the university are open from this month, with registrations taking place in August of next year. The first batch of classes will start in September 2020.

How are we going to produce the right number of people with the right mindset [and] the right knowledge ... That is what this university is about — providing that person power over 5 to 10 to 20 years.

- Michael Brady, Interim president and member of MBZUAI board of trustees
“Launching of the world’s first graduate level artificial intelligence university in Abu Dhabi echoes the UAE’s pioneering spirit, and paves the way towards a new era of innovation and technological advancement that benefits the UAE and the world,” tweeted His Highness Shaikh Mohammad Bin Zayed Al Nahyan, Crown Prince of Abu Dhabi and Deputy Supreme Commander of the UAE Armed Forces.
Dr Sultan Al Jaber, UAE Minister of State and chairman of the university’s board of trustees, speaking at the official press conference opening, said the university is in line with the UAE’s strategy of leveraging the latest and most innovative technologies – namely AI.
“The world has entered a new era of technological advancement and rapid innovation, all driven and underpinned by AI. This new era will pave the way for unprecedented opportunities. AI has become a priority, and is evident across all industries, with new technologies being introduced at an incredibly fast pace.
“The world needs more human capacity in the field of AI to bridge any possible gaps and that is why today the UAE and Abu Dhabi is announcing the launch of Mohammad Bin Zayed University of Artificial Intelligence – the world’s first graduate level research based AI university,” he added.
The Mohammad Bin Zayed University of Artificial Intelligence campus in Abu Dhabi. Both local and international graduate students can apply and classes will start in September 2020. Image Credit: Virendra Saklani/Gulf News
“This university will help us to develop the necessary AI eco system that will enable us to leverage the full potential of this very important technology locally, regionally and globally. The university will create an active AI community in the UAE developing innovative applications for businesses and government,” he said, highlighting what the university will be bringing to the community.
Al Jaber called MBZUAI and its concept an open invitation from the UAE to the world in unleashing the latest technological advancements, and said the university was looking forward to collaborating with the world’s foremost experts.
“The UAE clearly sees a phenomenal opportunity ahead where the UAE can demonstrate its unique capabilities in building bridges, extending help, support and collaboration.
“The art of partnership is a very clear model of engagement that the UAE has mastered over the years and this is what we’re doing yet again in a very sophisticated area of business centred around AI,” he added.

Developing human capacity

Bringing a wealth of experience from his own background in AI, robotics and imagery, Sir Michael Brady, who serves as interim president and a member of the board of trustees at MBZUAI, said the institute was a part of the UAE’s move towards a knowledge based economy, which would require the needed human capacity to ensure its success.
“This began with the government of the UAE formulating the strategy to transform the economy to the post oil era… To invest in developing competence in renewable energy, financial services, healthcare, materials technology and others.
“[One of the main] enabling technologies is AI, and then you ask what are the risks in realising that [vision]… and the answer is people,” he added.
“[So] how are we going to produce the right number of people with the right mindset [and] the right knowledge in order to lead and provide the technical leadership in these areas. That is what this university is about – providing that person power over the next 5 to 10 to 20 years,” he said.

Mohammad Bin Zayed University of Artificial Intelligence

  • Where: Masdar City, Abu Dhabi
  • What: World’s first dedicated graduate research level university in artificial intelligence. Three main specialised fields to be taught: machine learning, computer vision, natural language processing.
  • Who can apply: Both local and international graduate students.
  • When: Applications are currently open through www.mbzuai.ac.ae. Classes start in September 2020, with the size of the first batch of students yet to be determined.
  • Course level and duration: The university offers a Masters of Science degree which takes two years to complete, and a PhD programme which will take four years to complete.

Despite the success of reinforcement learning algorithms, a few challenges are still pervasive.

Rewards, which make up much of an RL system, are tricky to design. A well-designed reward system leads to outcomes with better accuracy.
In the context of reinforcement learning, a reward is a bridge that connects the motivations of the model with those of the objective. Reward design decides the robustness of an RL system. Designing a reward function doesn’t come with many restrictions, and developers are free to formulate their own functions. The challenge, however, is the chance of getting stuck in local optima.
Reward functions are peppered with clues that make the system/model/machine move in a certain direction. The clues in this context are mathematical expressions written with efficient convergence in mind.
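As an illustration, here is a hedged sketch of a hand-shaped reward for a locomotion task like the Gym benchmarks discussed below. The terms and weights are illustrative assumptions, not taken from any particular system; each term is one "clue", and badly chosen weights can leave the agent stuck in a local optimum (for example, idling to collect the survival bonus).

    # A hypothetical shaped reward: weighted clues nudging the agent forward.
    def shaped_reward(progress, alive, energy_used,
                      w_progress=1.0, w_alive=0.05, w_energy=0.01):
        reward = w_progress * progress            # distance moved toward the goal
        reward += w_alive if alive else -10.0     # survival bonus / failure penalty
        reward -= w_energy * energy_used          # discourage wasted actions
        return reward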

Automating Reward Design

Machine learning practitioners, especially those who deal with reinforcement learning algorithms, face a common challenge: making the agent realise that a certain task is more lucrative than another. To do this, they use reward shaping.
During the course of learning, the reward is edited based on the feedback generated on completion of tasks. This information is used to retrain the RL policy, and the process is repeated until the agent performs the desired actions.
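That manual loop can be sketched as follows. Here retrain(), evaluate() and edit_reward() are hypothetical stand-ins for task-specific code, and target_score is an assumed stopping criterion; the point is only the shape of the iteration.

    # A sketch of manual reward shaping: retrain, observe, edit, repeat.
    def reward_shaping_loop(policy, reward_fn, retrain, evaluate, edit_reward,
                            target_score):
        while True:
            policy = retrain(policy, reward_fn)    # retrain the policy under the current reward
            score, feedback = evaluate(policy)     # feedback from completed tasks
            if score >= target_score:              # agent performs the desired actions
                return policy, reward_fn
            reward_fn = edit_reward(reward_fn, feedback)  # edit the reward and repeat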
The need to retrain policies and observe them over long durations raises the question of whether reward design can be automated, and whether there can be a proxy reward that, while promoting learning, also meets the task objective.
In an attempt to automate reward design, the robotics team at Google introduced AutoRL, a method that automates RL reward design by using evolutionary optimisation over a given objective.
To measure its effectiveness, the team at Google applied AutoRL’s evolutionary reward search to four continuous control benchmarks from OpenAI Gym:
  1. Ant
  2. Walker2D
  3. HumanoidStandup
  4. Humanoid
These were run with two RL algorithms: off-policy Soft Actor-Critic (SAC) and on-policy Proximal Policy Optimisation (PPO).
To assess AutoRL’s ability to reduce reward engineering while maintaining the quality of existing metrics, the team considered both task objectives and standard returns.
Task objectives measure task achievement for continuous control: distance traveled for Ant, Walker2D, and Humanoid, and height achieved for HumanoidStandup. Standard returns, by contrast, are the metrics by which the tasks are normally evaluated.
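In spirit, evolutionary reward search can be sketched as a simple loop: sample candidate reward-weight vectors, train a policy under each, score the candidates by the task objective rather than by the shaped reward itself, and mutate the best candidates. The sketch below is a generic evolutionary search under those assumptions, not Google's implementation; train_policy() stands in for a full SAC or PPO training run and task_objective() for the benchmark's objective.

    import random

    def evolve_reward_weights(train_policy, task_objective, dim,
                              pop_size=16, generations=10, elite=4, sigma=0.1):
        # Start from random reward-weight vectors.
        population = [[random.uniform(0.0, 1.0) for _ in range(dim)]
                      for _ in range(pop_size)]
        best = None
        for _ in range(generations):
            # Inner loop: train a policy under each candidate reward, then
            # rank candidates by the task objective, not the shaped reward.
            scored = [(task_objective(train_policy(w)), w) for w in population]
            scored.sort(key=lambda s: s[0], reverse=True)
            best = scored[0]
            parents = [w for _, w in scored[:elite]]
            # Next generation: Gaussian mutations of the elite weight vectors.
            population = [[x + random.gauss(0.0, sigma)
                           for x in random.choice(parents)]
                          for _ in range(pop_size)]
        return best[1]  # best reward weights found by the search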

Key Findings

The authors, in their paper, list the following findings:
  • First, evolving rewards trains better policies than hand-tuned baselines, and on complex problems outperforms hyperparameter-tuned baselines, showing a 489% gain over hyperparameter tuning on a single-task objective for SAC on the Humanoid task.
  • Second, optimisation over simpler single-task objectives produces results comparable to the carefully hand-tuned standard returns, reducing the need for manual tuning of multi-objective tasks.
  • Lastly, under the same training budget, reward tuning produces higher-quality policies faster than tuning the learning hyperparameters. […]

Racial bias in a medical algorithm favors white patients over sicker black patients

A widely used algorithm that predicts which patients will benefit from extra medical care dramatically underestimates the health needs of...