Arm Virtual Hardware
Qeexo has been invited by Arm to integrate Arm Virtual Hardware (AVH) Platform ‘devices’ into Qeexo AutoML. The Arm Virtual Hardware Platform is a cloud-based hardware simulator of the Arm Cortex-M55 MCU and the Ethos-U55 microNPU Machine Learning Processor designs, available from Arm through the AWS Marketplace. Qeexo users can apply machine learning functions with these new virtual ‘devices’.
In this article:
Prerequisites
Working with Projects
Data Management
Building Machine Learning Models
Note: because Arm Virtual Hardware is virtual, sensor installation is NOT needed.
Prerequisites
Enable microphone access in your browser
Working with Projects
A. What is a Project?
"Project" is the basic unit of organization in the Qeexo AutoML system. A Project represents a collection of work to solve a specific machine learning problem on a particular target hardware (in this article, a virtual hardware).
You DO NOT need to install any hardware (sense module) to create a project, as AVH is virtual.
An example of a Project might be something like "Turbine-Predictive-Maintenance", where the data, models, and tests are compiled with the end goal of using machine learning for predictive maintenance on Arduino Nano 33 BLE devices attached to turbines.
B. Creating and Managing Project
To help you better understand this guide, we present a demo project, “Demo-AVH”, that demonstrates every step. Please follow the IN DEMO CASE markers throughout the article.
1. As a new user, you will be taken to the “Create Project” page after logging in, where you can specify a “Project Name”, “Classification Type”, and the “Target Hardware”.
a. Project Name: Enter a name that is reflective of the purpose of your project.
IN DEMO CASE we name it as “Demo-AVH”.
b. Classification Type: Choose “Multi-Class Classification”.
In the current AutoML version, Arm Virtual Hardware is ONLY supported with “Multi-Class Classification”.
Click the link for more information about the different types of classification.
IN DEMO CASE we select “Multi-Class Classification”.
c. Target Hardware: Select the hardware that will be used in your project, which for this article is “ARM Virtual Hardware”.
IN DEMO CASE we select “ARM Virtual Hardware”.

2. After making the selections, click the CREATE button to create your project.
Now you have successfully created your AVH project!
Data Management
Please refer to this page for Arm Virtual Hardware Project Best Practices.
A. Collecting data
Navigate to the Data Collection page to collect data using the Qeexo AutoML web app. This can be done either by clicking the COLLECT TRAINING DATA button or the DATA COLLECTION tab; either will take you to the Data Collection page.


Step 1: Build Environment
*What is an Environment?
An Environment is a physical setting with a given set of properties (e.g. acoustics, temperature, lighting). The range of this set of properties should match the range of the environment where the final machine learning model will eventually run. For example, training the machine learning models with data in your office will likely not work very well once you test the trained models on the factory floor. Environments also contain information about the given sensor configuration settings. All data collected for a given Environment will have the same sensor configuration.

You can either BUILD NEW ENVIRONMENT
by entering a unique "Environment Name", or SELECT AN ENVIRONMENT
to add more data to a previously recorded Environment. If selecting an existing Environment, the Sensor Configuration (in Step 2: Configure Sensors) will automatically populate with the Environment's previous settings. You should name your Environment something easily recognizable to you, with details about the specific location. For example, "OfficeCoffeetable" or "VestasTurbineSolano".
IN DEMO CASE we name the environment as “Office1” as the model training is done in an office environment. Once you input the name, click SAVE
to proceed to the next step.
Step 2: Configure Sensors
Click EDIT
in Step 2: Configure Sensors to view the list of the supported sensors on the Target Hardware. After selecting the sensors, you will need to configure the corresponding sampling rate - ODR (Output Data Rate) for each sensor and the Full Scale Range (FSR) when available.
Currently, AVH only supports the Microphone sensor, with a corresponding sampling rate (Output Data Rate, or ODR) of 16000 Hz.

After selecting the Microphone sensor, click on USE SENSOR CONFIG
to save the sensor configurations for your AVH project.
Step 3: Collect Data
Qeexo AutoML currently supports a variety of supervised classification algorithms for machine learning. For each of these algorithms, all data used for training must have an associated Class Label (also called the COLLECTION NAME).
*For multi-class, at least two unique classes must be defined. For most problems, we recommend that at least one of the classes be a "baseline" class that represents the typical environmental noise or behavior.
Whether or not baseline data is necessary depends on the use case and data selected. In general, the classes collected for multi-class classification should represent the full set of possible states for the given Environment. For example, if you want to build a multi-class model which can recognize various types of keywords/speech (e.g. Yes, No), you should also collect data that represents a baseline class (e.g. Silence).
Baseline Data:
Baseline data can be collected by setting the data type to Continuous and leaving the data collection application running while the environment is in a steady state of rest or typical operating behavior.
Some machine learning problems require collecting baseline data to differentiate events of interest from normal environmental conditions.
Baseline data is usually associated with each Environment (since different Environments will often have different baseline data characteristics).
For example, baseline data might be "NoGesture" in gesture recognition, "None" in kitchen appliance detection, "AtRest" in logistics monitoring, or “Silence” in speech recognition.
Class Label / Collection Label:
A Class Label is a machine learning concept, normally a word or phrase to label the event or condition of interest. For example, "Yes", "No", and "Silence" can be classes in our Speech Recognition Project.
For Continuous data, the Class Label applies to all of the data collected.
You must define one Class Label at a time when collecting data by entering a text string in the given field.
Note that only letters, numbers, and underscores are allowed when naming class labels in MLC projects.
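The naming rule above can be expressed as a simple validation check. A minimal sketch in Python (illustrative only; the helper name is ours, not part of Qeexo AutoML):

```python
import re

# Mirrors the rule above: letters, digits, and underscores only.
LABEL_PATTERN = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_class_label(label: str) -> bool:
    """Return True if the label uses only letters, digits, or underscores."""
    return bool(LABEL_PATTERN.match(label))

print(is_valid_class_label("Yes"))        # True
print(is_valid_class_label("No_Answer"))  # True
print(is_valid_class_label("bad label"))  # False (contains a space)
```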
Number of Seconds
This sets the duration of the data collection.
More data generally leads to higher performance. Depending on the complexity of the use case, the number of classes, the quality of the data, and many other factors, the optimal and minimum number of seconds to collect can vary greatly. We recommend starting with at least 30 seconds for each Class Label, but much more data may be required if the classes are highly variable or if the problem is sufficiently complex.
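For a rough sense of how much raw data a recording produces, the sample count is simply the sensor's ODR multiplied by the duration (a small illustrative calculation, not part of the product):

```python
def num_samples(odr_hz: int, seconds: int) -> int:
    """Total raw samples produced by one data collection."""
    return odr_hz * seconds

# 30 seconds at the AVH microphone's 16000 Hz ODR:
print(num_samples(16000, 30))  # 480000 samples
```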
IN DEMO CASE we are going to create 3 Class Labels: “Yes”, “No”, and “Silence”. “Silence” is our baseline class, meaning data collected while no words are being spoken. Move on to the next section to see how to record data for each Class Label.
*Note that you need to repeat Step 3 for each label’s data recording.

3-1 Recording data
After completing the previous steps, the RECORD button should now be clickable (shown in green in the picture above). If it is not, check the previous steps.
After clicking RECORD, you will be directed to the Data Recording page, where you will see an “Attention” prompt. It reminds you to position yourself directly in front of your computer microphone, at a convenient distance, when performing data collection and Live Replay; this helps improve the model’s performance. Once ready, click CONFIRM.

You will now see the following screen, which means you are all set to record data. When you are ready to start data collection, click START to begin. The text in the center circle will change from “READY” to “INITIALIZE” while the data collection software starts up.

After a few seconds, data collection will start when you see the circle turn green and display "GO". Data is now being collected. As you say “Yes” into the microphone, the center circle will show a “✓” sign.
Once the specified “number of seconds” has been collected, the labeled data will be uploaded to the database, and you will be redirected to the Data Collection page.

You can collect more data of the same or a different Class Label from the Data Collection page.
*Note that for a multi-class classification project, you will need at least 2 distinct classes (i.e., 2 different Class Labels) to be able to train a machine learning model.
IN DEMO CASE we will be creating 3 Class Labels in total: “Yes”, “No”, and “Silence”. You need to go through the Collect Data and Recording Data process two more times for the two remaining classes (“No” and “Silence”).
The final result should look like below:

IN DEMO CASE Note that the SILENCE data will yield a WARNING in DATA CHECK. It is okay to proceed.
3-2 Re-recording data
If you believe a mistake was made while recording data and the data has been contaminated, you can re-record it from the bottom of the Data Collection page. Click "Re-Record" to overwrite the existing data, or click the Trash icon to delete the Dataset and start over.

B. Uploading dataset
From the Data page, you may also upload previously-collected datasets to AutoML directly. These uploaded datasets can be used to train machine learning models.
*Note: Data with the same Class Label must be of the same data type (Event or Continuous).
Click UPLOAD TRAINING DATASET to upload .csv file(s). Each .csv should contain one or more data collections. All data contained in the .csv file must come from the same sensor configuration, which you will enter after uploading the .csv file. If you have more than 70 MB of data, you will need to split it into multiple .csv files. Please refer to this link for the Qeexo AutoML-defined data format.
Select Build an environment, then enter an ENVIRONMENT NAME that is relevant to your dataset. Then click CHOOSE FILE(S) to select the dataset that you would like to upload. Click NEXT to proceed.

*Note: Qeexo AutoML allows you to upload up to 10 files at a time, with a maximum size of 70 MB each. If you have more than 10 files, upload them in multiple batches. The first time, you may “Build an environment”. To add more files to your existing environment, click “Select an environment”, then select the existing environment and upload more files.

AutoML will then verify your data. Click SAVE to start the uploading process.
*Note: the sample (audio) dataset is large and may take some time to finish uploading; please be patient.


Once the process completes, you will be taken to the Data page, where you can view and manage your uploaded data.

C. Data check
Data check verifies the quality of the data, whether uploaded or collected. A failure in data check will not prevent you from using the data to train machine learning models. However, poor data quality may result in poor model performance.
Qeexo AutoML currently looks for the following data issues:
collected data does not match the selected sensors in the Sensor Configuration step
collected data does not match the selected sampling rate in the Sensor Configuration step
collected data contains duplicate or missing timestamps
collected data has duplicate or constant values
collected data contains invalid values, including NaN or inf
collected data is saturated
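For intuition, the checks above can be approximated in a few lines of Python. This is an illustrative sketch, not Qeexo's actual data-check implementation; the `full_scale` default assumes signed 16-bit samples:

```python
import math

def check_signal(timestamps, values, full_scale=32767):
    """Report the kinds of issues listed above for one sensor channel."""
    issues = []
    # Timestamps should be strictly increasing.
    if any(b <= a for a, b in zip(timestamps, timestamps[1:])):
        issues.append("duplicate or missing timestamps")
    if any(math.isnan(v) or math.isinf(v) for v in values):
        issues.append("invalid values (NaN or inf)")
    if len(set(values)) <= 1:
        issues.append("constant values")
    if any(abs(v) >= full_scale for v in values):
        issues.append("saturated samples")
    return issues

print(check_signal([0, 1, 2], [0.1, float("nan"), 0.3]))
# ['invalid values (NaN or inf)']
```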
Here is an example of a data check with warnings:
A green PASS icon indicates that data check has passed;
A yellow WARNING icon indicates that the data contains one or more issues from the list above;
A red ERROR icon indicates that something went wrong during data collection or during data check (connection error or device error); the data may not be usable if the status remains ERROR after a refresh.
D. Training data, Test data and Data operation
Click the link for more information about Training data, Test data, and Data operations.
E. Viewing and managing project data
All of the Datasets associated with the current Project can be viewed and managed from the Data page. You can review the Dataset Information including its Sensor Configurations and Data Check results, as well as visualize and delete them.

F. Visualizing data
AutoML provides users with the ability to plot and view sensor data directly from the platform using the onboard data visualization tool. To visualize training or test data, click the data visualization icon.

Navigate data using scroll, scale, and zoom options, and view data in either the Time Domain or the Frequency Domain.
Time Domain visualization is a visual representation of the signal’s amplitude and how it changes over time: the x-axis represents time, and the y-axis represents the signal’s amplitude. Frequency Domain visualization, also known as spectrogram visualization, is a visual representation of the spectrum of frequencies of a signal as it varies with time: the x-axis represents time, and the y-axis represents the signal’s frequency.
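The two views correspond to the raw waveform versus its spectrum. As an offline illustration (not the AutoML visualizer itself), the dominant frequency in one analysis window of a 16000 Hz signal can be found with an FFT; numpy is assumed to be available:

```python
import numpy as np

fs = 16000  # AVH microphone ODR (Hz)
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)  # time domain: a 440 Hz test tone

# Frequency domain: magnitude spectrum of one 512-sample window.
window = x[:512] * np.hanning(512)
spectrum = np.abs(np.fft.rfft(window))
freqs = np.fft.rfftfreq(512, d=1 / fs)

peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)  # ~437.5 Hz: the nearest bin to 440 Hz at 31.25 Hz resolution
```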
Building Machine Learning Models
A. Getting started
Navigate to the Data page to build machine learning models with uploaded training data.
Select the training datasets that you want to use for building machine learning models by clicking the checkbox to the left of each Dataset.
*Note that the selected Datasets should ideally be from the same Environment, but Qeexo AutoML will allow you to train Datasets from different Environments as long as the selected sensors and Sensor Configuration are identical.
IN DEMO CASE we select all 19 datasets that we uploaded from the ‘train’ folder. Once the desired Datasets are selected, click the START NEW TRAINING button to configure Training Settings.
*Note that the START NEW TRAINING button is only clickable when Datasets containing 2 or more Class Labels are selected for Multi-class classification. For One-class classification, the button becomes clickable as soon as one Class Label has been selected.
B. Training settings
Step 1: Group labels
This optional step lets you group multiple Class Labels into one Class Label before training the model. It can be bypassed by pressing the SKIP button.
For example, for a single-class classification project applied to anomaly detection, you may have machinery data that is labelled based on two different types of motion: vertical rotation (UPDOWN) and horizontal rotation (LEFTRIGHT). Since both of these classes are expected behavior, it is convenient to group these labels as a "Normal" group to feed into single-class classification.
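Conceptually, grouping is a many-to-one relabeling applied before training. A sketch of the UPDOWN/LEFTRIGHT example above (illustrative code, not Qeexo's implementation):

```python
# Group both rotation labels into a single "Normal" class.
GROUPS = {"UPDOWN": "Normal", "LEFTRIGHT": "Normal"}

def group_label(label):
    """Map a raw Class Label to its group; ungrouped labels pass through."""
    return GROUPS.get(label, label)

print([group_label(l) for l in ["UPDOWN", "LEFTRIGHT", "TAP"]])
# ['Normal', 'Normal', 'TAP']
```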

IN DEMO CASE we will skip Group Labels step as we don’t need to.
Step 2: Model Selection & Settings
This page is where you select which model type(s) to train. We discuss only the Multi-Class Classification type here, as Arm Virtual Hardware projects are only supported with Multi-Class Classification.
Please click the link for the Model Selection & Settings procedure for all other “Classification Types”.
(1) Algorithm Selection
For Multi-Class Classification, Qeexo AutoML supports the following machine learning algorithms:
*Selecting more than one type of algorithm is recommended, so that results can be compared.
Support for additional algorithms will be added in the future.

*Note: Neural Network models may take longer to train, due to the significant computation required for the training process.
*Additional: the CONFIGURE button
Note: many of these parameters interact with each other in unique and non-intuitive ways. Unless you have significant experience tuning deep learning models, you may want to consider using the automatic hyperparameter optimization tool.
Pressing CONFIGURE (available for some models) will yield the following configuration screen:


Quantization denotes an option to conduct quantization-aware training so as to reduce model size.
There are additional configurable options to fine-tune the ANN model.
Similarly, there are configurable options to fine-tune the CNN model.
Configuration sub-menus for other algorithms will be added in the future.
Select the algorithm(s) you want to train by clicking the Switch button; you can choose one or more algorithms. Then click NEXT to proceed to Model Settings.
IN DEMO CASE we are going to select 4 algorithms (GBM, ANN, SVM, and DT), then click NEXT to proceed to the Model Settings page.
(2) Model Settings
The Model Settings page has two parts for you to select and input information: Generate Learning Curve(s) and Hyperparameter Tuning.

- Generate Learning Curve(s)
If enabled, this option will produce learning curves for the given data set. Learning curves visualize how your model is improving as more data is added. These curves can be extrapolated, which can be useful for determining if the model may benefit from additional data collection.
As shown in the example below, the "Circle" and "Punch" gestures are still improving with additional data. It is likely that they would continue to improve if more data is collected.
*Note: If the dataset that is used for training is very small, the learning curves may not be accurate. The model may be very good at classifying the limited data it's seen, but might not generalize to new cases. In that case, even if the learning curve does not show it, it is safe to assume that final model performance will improve with additional data collection.
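A learning curve is produced by training on progressively larger subsets and scoring a fixed held-out set. The sketch below uses a toy nearest-centroid classifier on synthetic 1-D data as a stand-in for a real model (illustrative only, not AutoML's internals):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: class 0 centered at -1, class 1 at +1.
X = np.concatenate([rng.normal(-1, 1, 300), rng.normal(1, 1, 300)])
y = np.concatenate([np.zeros(300), np.ones(300)])
order = rng.permutation(600)
X, y = X[order], y[order]
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

def fit_predict(X_train, y_train, X_test):
    """Nearest-centroid classifier: a stand-in for a real model."""
    c0 = X_train[y_train == 0].mean()
    c1 = X_train[y_train == 1].mean()
    return (np.abs(X_test - c1) < np.abs(X_test - c0)).astype(float)

# Learning curve: held-out accuracy as the training subset grows.
for n in (20, 100, 400):
    acc = (fit_predict(X_tr[:n], y_tr[:n], X_te) == y_te).mean()
    print(n, round(float(acc), 2))
```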

- Hyperparameter Tuning
Hyperparameters are a set of adjustable parameters of machine learning models. These parameters affect the accuracy, runtime, and size of machine learning models; different models have different parameters depending on the model architecture. AutoML provides a built-in option for tuning these hyperparameters: there is a simple switch to flip if hyperparameter optimization is desired. If this option is enabled, AutoML tunes hyperparameters using a collection of optimization techniques tailored to TinyML applications. It maximizes accuracy while ensuring that all resource usage stays within constraints (e.g., firmware binary size and memory usage). This option will often improve final model accuracy at the expense of additional runtime for model building.
There are three settings that affect the duration of the hyperparameter tuning stage:
- Optimizer Time Limit:
- Optimizer Number of Trials:
- Optimizer Error Threshold:
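Conceptually, these three settings bound a search loop like the sketch below. This random search is illustrative only and is not Qeexo's optimizer; `objective` is assumed to return an error to minimize:

```python
import random
import time

def tune(objective, space, time_limit_s=60, max_trials=50, error_threshold=0.05):
    """Random search bounded by the three settings described above."""
    best_params, best_err = None, float("inf")
    start = time.monotonic()
    for _ in range(max_trials):                      # Optimizer Number of Trials
        if time.monotonic() - start > time_limit_s:  # Optimizer Time Limit
            break
        # Sample one candidate configuration from the search space.
        params = {k: random.choice(v) for k, v in space.items()}
        err = objective(params)
        if err < best_err:
            best_params, best_err = params, err
        if best_err <= error_threshold:              # Optimizer Error Threshold
            break
    return best_params, best_err
```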
Once you are ready, click START TRAINING
to proceed to Training Process.
IN DEMO CASE For Model Settings, we will leave everything as default, simply click START TRAINING
to proceed.
C. Training process
Once you click START TRAINING with one or more machine learning algorithms selected, the training process will begin.
Real-Time Training Progress pops up after training begins. The top row shows the progress of common tasks (e.g. featurization, data cropping, etc.) shared between different algorithms, followed by the build progress of each of the selected models.
At the end of the training process, Qeexo AutoML will flash, in sequence, each of the built models to the hardware device to test and measure the average latency for performing classifications.
IN DEMO CASE Note that the demo may take up to a few hours to train the models. Please be patient.

D. Training result
Click TRAINING RESULT
to navigate to the Models page (also reachable from the top navigation bar), where all of the previous trainings will be listed, with the most recent one on top.
The current training will be expanded to show relevant information about model performance, including ML MODEL (the type of machine learning model), CROSS VALIDATION accuracy, LATENCY, SIZE, and additional PERFORMANCE SUMMARY. It also allows you to SAVE each model to your computer, PUSH TO HARDWARE (push a selected model to Target Hardware for LIVE TEST), LIVE CLASSIFICATION ANALYSIS and DELETE the model.
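The CROSS VALIDATION accuracy shown for each model is a standard k-fold estimate. A plain sketch (illustrative; `fit_predict` is any caller-supplied train-and-predict function, and numpy is assumed):

```python
import numpy as np

def cross_val_accuracy(X, y, fit_predict, k=5, seed=0):
    """Average held-out accuracy over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    accs = []
    for i in range(k):
        test_idx = folds[i]
        # Train on all folds except the held-out one.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        preds = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        accs.append(float((preds == y[test_idx]).mean()))
    return float(np.mean(accs))
```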
E. Test model performance on Test Data
You can test all your ML models' performance using the uploaded test data (if you have any).

Click the button under EDIT TEST DATA; a Model Information window will pop up.

Select the test data that you need, then click SAVE.
*Note: you can click the button under Training Data to select labels for each Test Data, as shown in the screenshot below.

The Test Data evaluation then starts.
Once the evaluation is complete, you can find the results of model performance on Test Data by clicking the buttons under PERFORMANCE SUMMARY for each model.
F. Live Replay
From the Test Result page, you can select a model of interest and click the LIVE REPLAY button to test your model live. You will then be taken to the Live Testing page.
On the Live Testing page, you can record up to 5 seconds of audio by clicking START. Note that you don’t need to wait for the full 5 seconds; you can click STOP whenever you have finished recording.
Then click ANALYZE
to analyze the data you just recorded.



IN DEMO CASE Click START and say the word “Yes”, then click STOP → ANALYZE to proceed.
You can check the recorded data on the Data → Test Data page.