Thoughts on Tesla AI Day

Tesla Bot; photo from Lex Fridman’s tweet

Tesla has done several technical deep dives over the years to update the world on the company’s progress in full self-driving (FSD) and battery technology. It started with Autonomy Day, followed by Battery Day and, more recently, AI Day. Tesla uses these events to help recruit the best and brightest.

The events are also probably staged to make Tesla stock short sellers sweat profusely. Elon Musk had already alluded to most of what was presented at the event in various presentations, interviews and podcasts. Seeing it fleshed out in detail must have made critics regret calling Elon Musk a shyster.

The details provided at Tesla AI Day make it a masterclass on AI. The presentations and the Q&A will likely be dissected by AI researchers and enthusiasts for months to come. I do not profess to understand all the technical details presented but from what little I know, it’s clear that Tesla will make fully autonomous vehicles a reality sooner than expected.

Any sufficiently advanced technology is indistinguishable from magic.

Sir Arthur C. Clarke

Building AI models

Swept up in the machine learning and AI hype back in 2016, I dove into the subject matter. The best way to learn is by doing and by explaining the subject matter to an audience in simple terms. The end result was a paper (see Section 6) and a presentation at the Institute and Faculty of Actuaries on a sentiment analysis model.

To set the scene, my sentiment analysis model, built on artificial neural networks (neural nets), is trained to do a simple task – classify central banks’ text communications (e.g. speeches and reports) into one of three categories: “hawkish”, “neutral”, or “dovish” (i.e. central banks’ parlance for how bullish or bearish central bankers are on the economy).

Summary of the end-to-end sentiment analysis model building process

The end-to-end process of building the model can be summarised as follows:

  1. Data collection and pre-processing – To train the neural nets, I needed appropriately labelled text data. I used the Bank of England’s Monetary Policy Committee (MPC) meeting minutes because they were readily available and were labelled with the outcome, i.e. if the MPC decided to reduce interest rates or increase quantitative easing (QE), the meeting minutes should reflect a dovish sentiment. I then used Natural Language Processing (NLP) techniques to process the text data.
  2. Model training and validation – I used one of the many freely available open source neural net algorithms to train the model. The pre-processed text was fed to the algorithm as inputs. Broadly, the algorithm works by giving the inputs weights and passing them through layers of non-linear functions (aka “neurons”), arriving at the output function which classifies text into one of three categories – hawkish, neutral, or dovish. The first iteration of the model was then tested with text the model had not seen before (i.e. out-of-sample data). The training and validation process was repeated until the classification error was acceptable.
  3. Model deployment – Once trained and validated, the model was deployed to make inferences on the sentiment in Bank of England’s regular publicly available updates and communications.
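The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the model from my paper: it uses a single softmax layer (no hidden layers) over bag-of-words counts, standard library only, and the training sentences, labels and function names are all made up for the example.

```python
import math
import re

LABELS = ["hawkish", "neutral", "dovish"]

def tokenize(text):
    """Step 1 (pre-processing): lower-case and split into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(texts):
    """Assign a column index to every distinct token in the training texts."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Bag-of-words counts; tokens not seen during training are ignored."""
    vec = [0.0] * len(vocab)
    for token in tokenize(text):
        if token in vocab:
            vec[vocab[token]] += 1.0
    return vec

def predict_proba(model, text):
    """Weighted sum of the inputs per class, turned into probabilities."""
    vocab, weights, bias = model
    vec = vectorize(text, vocab)
    scores = [sum(w * x for w, x in zip(row, vec)) + b
              for row, b in zip(weights, bias)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(model, text):
    """Step 3 (deployment): return the most probable label."""
    probs = predict_proba(model, text)
    return LABELS[probs.index(max(probs))]

def train(texts, labels, epochs=200, lr=0.1):
    """Step 2 (training): stochastic gradient descent on cross-entropy loss."""
    vocab = build_vocab(texts)
    weights = [[0.0] * len(vocab) for _ in LABELS]
    bias = [0.0] * len(LABELS)
    model = (vocab, weights, bias)  # weights and bias are updated in place
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            vec = vectorize(text, vocab)
            probs = predict_proba(model, text)
            for k, name in enumerate(LABELS):
                grad = probs[k] - (1.0 if name == label else 0.0)
                bias[k] -= lr * grad
                for j, x in enumerate(vec):
                    weights[k][j] -= lr * grad * x
    return model

def accuracy(model, texts, labels):
    hits = sum(classify(model, t) == y for t, y in zip(texts, labels))
    return hits / len(texts)

# Made-up, hand-labelled sentences standing in for MPC minutes.
train_texts = [
    "the committee voted to raise bank rate as inflation pressures increased",
    "inflation risks warrant tighter policy and a rate increase",
    "the committee voted to maintain bank rate and leave policy unchanged",
    "policy was left unchanged with risks broadly balanced",
    "the committee voted to cut bank rate and expand asset purchases",
    "weak growth warrants looser policy and more asset purchases",
]
train_labels = ["hawkish", "hawkish", "neutral", "neutral", "dovish", "dovish"]

# Out-of-sample text the model has not seen during training.
test_texts = ["the committee decided to expand asset purchases",
              "tighter policy is needed as inflation pressures build"]
test_labels = ["dovish", "hawkish"]

model = train(train_texts, train_labels)
train_acc = accuracy(model, train_texts, train_labels)
test_acc = accuracy(model, test_texts, test_labels)
print("in-sample accuracy:", train_acc)
print("out-of-sample accuracy:", test_acc)
```

Comparing the two accuracy figures is the validation step: a model that scores well in-sample but poorly out-of-sample has over-fitted. A real model would need far more data and, typically, hidden layers.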

Tesla’s approach to FSD

Tesla is of course solving a problem 1000x harder and more useful than classifying text documents. Achieving full self-driving (FSD) is part of Tesla’s master plan. Tesla is relying primarily on computer vision to make vehicles fully autonomous. Elon Musk famously said that using LIDAR is a fool’s errand (the jury is still out on whether using LIDAR is indeed foolish).

Tesla’s end-to-end process – from collecting the data, training and validating the neural nets, to deploying the neural nets – is nothing short of impressive. The main takeaway I have from the event is that Tesla’s technology stack enables the team to iterate quickly to improve the performance of the FSD system. Fully self-driving vehicles may be a reality sooner than expected.

Data collection

Data is the be-all and end-all for AI models. Each Tesla vehicle on the road (the “fleet”) is equipped with eight cameras (fewer in older vehicles). The anonymised images and videos collected from these cars are Tesla’s primary source of training data.

Key points from the AI Day event:

  • One reason LIDAR is not required for Tesla to achieve FSD is the superiority of the dataset at its disposal. For example, to train the neural nets to “see” through fog, Tesla can request images and videos of driving in foggy situations from the fleet;
  • Tesla has built a scalable solution to label and annotate the vast amount of data it has. For example, it can now automatically label traffic cones among many other objects;
  • For edge cases with limited real-life data, Tesla uses simulation to train the neural nets. These simulations involve letting the neural nets drive in a video game. The video game has super photorealistic graphics to avoid “over-fitting” (i.e. a common challenge whereby the neural nets perform well when using in-sample data but poorly when presented with out-of-sample data). One of the presenters at the AI Day event claimed that Tesla’s photorealistic simulation is better than the impressive research results presented in this academic paper;
  • The trained model is deployed to the fleet in “shadow” mode. This is one of a number of ways Tesla evaluates and validates the model’s performance. The driver acting differently from the software (for example, turning right when the model wants to turn left) provides valuable model validation data.
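The shadow mode idea in the last bullet can be sketched as a simple comparison between what the driver did and what the model would have done. This is only an illustration of the concept: the ShadowFrame record, the action names and the log below are hypothetical, and Tesla’s actual tooling was not described at this level of detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ShadowFrame:
    timestamp_ms: int
    driver_action: str  # what the human driver actually did
    model_action: str   # what the shadow model would have done

def find_disagreements(frames):
    """Return the frames where the driver and the shadow model differ."""
    return [f for f in frames if f.driver_action != f.model_action]

def agreement_rate(frames):
    """Fraction of frames where driver and model chose the same action."""
    if not frames:
        raise ValueError("no frames to evaluate")
    return 1 - len(find_disagreements(frames)) / len(frames)

# Hypothetical drive log; the model never controls the car in shadow mode.
drive_log = [
    ShadowFrame(0, "straight", "straight"),
    ShadowFrame(100, "straight", "straight"),
    ShadowFrame(200, "turn_right", "turn_left"),
    ShadowFrame(300, "brake", "brake"),
]

flagged = find_disagreements(drive_log)
print("frames flagged for review:", [f.timestamp_ms for f in flagged])
print("agreement rate:", agreement_rate(drive_log))
```

The flagged frames are the interesting ones: each is a candidate for review, labelling and inclusion in the next round of training or validation data.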

Model design and architecture

Tesla’s FSD architecture; source: Tesla

Tesla uses its proprietary model design and architecture to translate video data into a car that can navigate the world around it. The model comprises multiple neural nets, each with a different task.

Key points from the AI Day event:

  • The raw video data from the eight cameras are processed by neural nets to extract salient features and information. The most useful information is transformed into a “vector space”, a 3D representation/model of the world around the car. This vector space is labelled and annotated before being used to train the neural nets responsible for driving the car;
  • To imbue the car with “memory” so that it “remembers” objects hidden from its current view, the model is fed images from the past – in space and time – every 1 meter and every 27 milliseconds.
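One way to picture the “memory” in the last bullet is a pair of fixed-length queues, one that stores a snapshot every 27 milliseconds and one that stores a snapshot every metre travelled. The sketch below assumes that reading; the class name, the queue length and the placeholder feature strings are my own inventions, not Tesla’s implementation.

```python
from collections import deque

class MemoryQueues:
    """Keeps recent feature snapshots, sampled by elapsed time and by distance."""

    def __init__(self, time_step_ms=27, distance_step_m=1.0, maxlen=20):
        self.time_step_ms = time_step_ms
        self.distance_step_m = distance_step_m
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self.time_queue = deque(maxlen=maxlen)
        self.space_queue = deque(maxlen=maxlen)
        self._last_time_ms = None
        self._last_odometer_m = None

    def observe(self, timestamp_ms, odometer_m, features):
        """Store the features if enough time or distance has passed."""
        if (self._last_time_ms is None
                or timestamp_ms - self._last_time_ms >= self.time_step_ms):
            self.time_queue.append((timestamp_ms, features))
            self._last_time_ms = timestamp_ms
        if (self._last_odometer_m is None
                or odometer_m - self._last_odometer_m >= self.distance_step_m):
            self.space_queue.append((odometer_m, features))
            self._last_odometer_m = odometer_m

memory = MemoryQueues()

# Ten frames while the car is stationary (e.g. waiting at a red light).
for i in range(10):
    memory.observe(timestamp_ms=27 * i, odometer_m=0.0, features=f"frame-{i}")

# Four more frames while the car moves 0.5 m per frame.
for i in range(10, 14):
    memory.observe(timestamp_ms=27 * i, odometer_m=0.5 * (i - 9),
                   features=f"frame-{i}")

print("time queue length:", len(memory.time_queue))
print("space queue length:", len(memory.space_queue))
```

In this sketch the time queue fills up with near-identical snapshots while the car is stationary, whereas the space queue only grows when the car actually moves, so it holds on to what was seen further back along the road.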

Model training and validation

Tesla’s numerous neural nets currently take 70,000 GPU hours to train. Training neural nets is compute-intensive, even more so when the training data is video. Tesla is developing its proprietary Dojo supercomputer to provide the computing firepower needed to regularly retrain and improve the performance of the neural nets.
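A back-of-the-envelope calculation shows why the size of the cluster matters. The sketch below converts the 70,000 GPU hours into wall-clock time; the GPU counts and the scaling efficiency parameter are assumptions for illustration, not figures from the event.

```python
def wall_clock_hours(gpu_hours, num_gpus, scaling_efficiency=1.0):
    """Wall-clock training time, assuming the work splits across GPUs.

    scaling_efficiency is a hypothetical factor in (0, 1]; 1.0 means
    perfect linear scaling, which real clusters rarely achieve.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return gpu_hours / (num_gpus * scaling_efficiency)

GPU_HOURS = 70_000  # figure quoted at AI Day

for gpus in (8, 1_000):  # assumed cluster sizes
    hours = wall_clock_hours(GPU_HOURS, gpus)
    print(f"{gpus} GPUs: {hours:.0f} hours (about {hours / 24:.1f} days)")
```

On eight GPUs a single training run would take roughly a year; on a thousand it takes about three days, which is what makes frequent retraining practical.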

Key points from the event:

  • The Dojo D1 chip will be modular and scalable; it’s designed to be stacked to increase computing power as needed. While not explicitly mentioned at the event, the Dojo supercomputer could potentially be offered as an AI model training-as-a-service, in direct competition with Amazon Web Services (AWS);
  • To validate the retrained model quickly, Tesla has developed custom tools to compare and evaluate models.

Model deployment

Once the model is trained and validated, it’s deployed via over-the-air (OTA) software update to the fleet. Newer cars are equipped with Tesla’s proprietary Full Self Driving (FSD) Computer (older cars have the Nvidia chip), which runs inference on the neural nets to output commands to the vehicle. The FSD Computer is designed from the ground up and optimised for low power consumption and low latency. A deep dive into the FSD Computer was done at the Autonomy Day event.

One last thing

Elon Musk said at the start of the event that Tesla “is much more than just an electric car company” and that it’s “arguably the leader in real-world AI” applications. Tesla wants to use its proprietary AI software and hardware to build applications beyond fully self-driving vehicles. One such application that Elon has in mind is the humanoid Tesla Bot.

Would such a robot make economic sense for the company, someone asked during the Q&A, given that the Tesla Bot’s use case (i.e. repetitive and boring labour) is usually not highly compensated? To which Elon curtly replied, “Well, we’ll just have to see”.

As always, the rule of thumb is to NOT bet against Elon Musk. There was a moment during the Q&A when Elon explained why heat is a problem for the Dojo supercomputer; he explained using first principles and ended by quoting the relevant physics formula. That is why he’s the TechnoKing of Tesla.

The bottom line: Tesla’s master plan is to build fully self-driving (FSD) vehicles. To that end, Tesla has amassed a vast amount of training data, and has built proprietary software and hardware to train, validate and deploy neural networks (neural nets). Its technology stack enables Tesla to iterate quickly to improve the performance of its FSD system. Fully autonomous vehicles may be a reality sooner than expected.

Disclaimer and disclosure: This is my personal view as a long-time Tesla fan. It is NOT investment advice/recommendation. I write on this blog in my personal capacity; my opinions are NOT endorsed by my employer or the actuarial profession. I am financially and emotionally invested in Tesla’s success.

Tesla referral link: If you drive regularly, an electric car is what you need. That electric car should ideally be a Tesla. Get 1,000 miles free Supercharging by using my referral link: