
This repository contains the implementation code for InTec (Integrated Things-Edge Computing), a novel framework for distributing machine learning (ML) pipelines across Things, Edge, and Cloud layers in IoT environments.


InTeC Edge-AI Framework

The InTeC Framework is an Edge-AI framework that distributes machine learning execution across the IoT layers (Things, Edge, Cloud). The distributed tasks include:

  • Validation & Preprocessing
  • Model Training & Analysis
  • Thing-based AI Model Deployment for Faster Responses

📄 Published Paper: Springer Journal of Computing
📑 Preprint Available: arXiv Preprint


📖 Table of Contents

  1. Introduction
  2. InTeC Architecture & Pipeline
  3. Installation and Setup
  4. Edge Service API Documentation
  5. Configure Environment Variables
  6. License


Introduction

The InTeC Framework addresses the growing latency and network traffic involved in serving inferences on IoT data. It distributes every component of the machine learning pipeline (validation, preprocessing, model training, model analysis, and model deployment) across the Things, Edge, and Cloud layers, with a particular focus on moving model deployment to the "Things" layer to improve the system's response time.

To relieve the growing data-analysis load at the edge layer, the framework deploys models on the "Things" layer, assuming these IoT devices have a minimum level of processing power. The core hypothesis of this research is that a distributed machine learning pipeline within the Things-Edge-Cloud architecture, in which learning models are deployed on each IoT device to analyze the data its sensors generate, can significantly reduce system latency, network traffic, and edge energy consumption while improving throughput.

InTeC is evaluated against related work on the Human Activity Recognition (HAR) problem in a smart home environment, using the MHEALTH dataset as the primary data source.

Why InTeC?

  • Distributed ML Processing: Validates, preprocesses, trains, and deploys models across the Things-Edge-Cloud architecture.
  • Low Latency & Network Overhead: Reduces system latency, bandwidth usage, and energy consumption.
  • Use Case - Human Activity Recognition (HAR): Tested using the MHEALTH dataset in a smart home environment.

InTeC Architecture & Pipeline

The Things-Edge-Cloud architecture ensures efficient machine learning pipeline deployment.

| Layer | Role |
| --- | --- |
| Things Layer | Deploys ML models directly on IoT devices for local inference. |
| Edge Layer | Prepares data via preprocessing, outlier detection, and dimensionality reduction before cloud transmission. |
| Cloud Layer | Trains and updates ML models while managing large-scale analytics. |

Diagrams

[Figure: InTeC Framework Pipeline]

[Figure: InTeC Framework Architecture]

Installation and Setup

Prerequisites

Ensure you have Docker and Docker Compose installed. The commands below also use git and unzip.

🛠 Optional (Recommended for Docker Management)

  • Portainer → A web-based UI for managing Docker containers, images, and volumes.
    • Install with:
      docker volume create portainer_data
      docker run -d -p 9000:9000 --name portainer --restart=always \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v portainer_data:/data portainer/portainer-ce
    • Access Portainer UI at: http://localhost:9000

Clone the Repository

git clone https://github.com/IDASLab/InTec_Framework.git
cd InTec_Framework

Extract Sensor Data

The sensor dataset is compressed (.zip) inside the sensor/ directory to reduce repository size. Before running the services, you need to unzip the dataset:

cd sensor
unzip data.zip -d .
cd ..

This extracts sensor data into the sensor/data/ directory.
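
On systems without the unzip utility, the same step can be done with Python's standard library. The following is a minimal sketch, assuming the archive is sensor/data.zip as in the commands above; the function name extract_dataset is illustrative and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive: Path, target_dir: Path) -> list:
    """Extract every member of `archive` into `target_dir` and return the member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

Run from the repository root, `extract_dataset(Path("sensor/data.zip"), Path("sensor"))` mirrors `unzip data.zip -d .` executed inside sensor/.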

Build and Start Services

docker-compose up -d --build

This launches:

  • ✅ Edge Service (intec-edge-service)
  • ✅ Edge Analysis (intec-edge-analysis)
  • ✅ Sensor Node (intec-sensor)
  • ✅ EMQX Broker (intec-emqx-broker)
  • ✅ MongoDB (intec-mongo-db)

Edge Service API Documentation

📌 Base URL

| Environment | URL |
| --- | --- |
| Local | http://localhost:1010 |
| Network Accessible | http://<HOST_IP>:1010 |
| Dockerized | Uses HTTP_PORT from .env (default: 1010) |

📌 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Returns a welcome message. |
| /sensors | GET | Retrieves a list of all unique sensor device names. |
| /sensors/:deviceName | GET | Retrieves all sensor data for a given device. |
| /sensors/:deviceName/latest | GET | Retrieves the latest recorded data for a given device. |
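
The endpoints above can be called from any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes the default local base URL and that the service replies with JSON; the response format is not documented here, and the helper names sensor_url and fetch_json are illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:1010"  # default HTTP_PORT; adjust for your host


def sensor_url(device_name=None, latest=False, base_url=BASE_URL):
    """Build the URL for one of the /sensors endpoints."""
    if device_name is None:
        return f"{base_url}/sensors"
    # Percent-encode the device name so spaces or slashes cannot break the path.
    url = f"{base_url}/sensors/{quote(device_name, safe='')}"
    return f"{url}/latest" if latest else url


def fetch_json(url, timeout=5.0):
    """GET `url` and decode the body as JSON (assumption: the service returns JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `fetch_json(sensor_url())` would request the device list, and `fetch_json(sensor_url("sensor01", latest=True))` would request the latest record for a device named sensor01.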

Configure Environment Variables

InTeC is modular: each component has its own environment configuration. The Analysis Core and Service Core each load their own .env file, while the Sensor Node is configured directly in docker-compose.yml.

| Component | Configuration Method | Key Settings |
| --- | --- | --- |
| Analysis Core | edge/analysis_core/.env | Inference, Outlier Detection, Dimensionality Reduction, MongoDB, MQTT (Sensor & Cloud) |
| Service Core | edge/service_core/.env | HTTP Port, MongoDB Connection |
| Sensor Node | docker-compose.yml | MQTT Broker, Topic, Data Collection Rate, Subject (MHEALTH dataset), Runtime Duration |

📌 Analysis Core

The Analysis Core is responsible for data inference, outlier detection, dimensionality reduction, and publishing processed data.

  • Inference Model Configuration: Enables or disables inference models.
  • Outlier Detection: Configures the outlier detection model and drop rate.
  • Dimensionality Reduction: Enables PCA/AE to reduce sensor data size.
  • MQTT Config: Defines MQTT brokers and topics for data processing.
  • MongoDB Config: Stores processed sensor data.
  • Logging Config: Controls the log level.

📄 Modify edge/analysis_core/.env

# 🚀 General Settings
CLIENT_ID=Edge_UB01

# 🏷️ Inference Model Configuration
INFERENCE_ENABLE=False
INFERENCE_MODEL=CNN_LSTM  # Options: CNN, LSTM, CNN_LSTM, FFNN
SLIDING_WINDOW_SIZE=25  # Options: 25, 50, 100

# 🔍 Outlier Detection Configuration
OUTLIER_ENABLE=True
OUTLIER_MODEL=IsolationForest  # Options: IsolationForest
OUTLIER_DROP_RATE=80

# 📉 Dimensionality Reduction Configuration
REDUCTION_ENABLE=True
REDUCTION_MODEL=PCA  # Options: PCA, AE

# 🌐 Sensor MQTT Broker (For Incoming Sensor Data)
SENSOR_MQTT_BROKER=intec-emqx-broker
SENSOR_MQTT_PORT=1883
SENSOR_MQTT_TOPIC=sensor/data

# ☁️ Cloud MQTT Broker (For Processed Data)
CLOUD_MQTT_BROKER=intec-emqx-broker
CLOUD_MQTT_PORT=1883
CLOUD_MQTT_TOPIC=cloud/data
TRAINING_MQTT_TOPIC=cloud/data

# 🌐 Cloud Data Sync Interval
CLOUD_SYNC_PERIOD=1 # Sync every 15 minutes

# 🛢️ MongoDB Configuration
DB_URL=mongodb://user:password@intec-mongo-db:27017/edge?authSource=admin
DB_COLLECTION=sensors

# ⚙️ Logging & Debugging
LOG_LEVEL=INFO
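
To illustrate how settings like these can be read and validated, here is a minimal sketch that parses the file's text into a typed object. It is not the Analysis Core's actual loader: the names parse_env, as_bool, AnalysisConfig, and load_analysis_config are hypothetical, and the allowed values are taken from the "Options" comments in the file above.

```python
from dataclasses import dataclass

INFERENCE_MODELS = {"CNN", "LSTM", "CNN_LSTM", "FFNN"}
WINDOW_SIZES = {25, 50, 100}
REDUCTION_MODELS = {"PCA", "AE"}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat " #" as the start of an inline comment. This would truncate
        # a value that legitimately contains " #".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def as_bool(value):
    """Interpret the True/False strings used in the file."""
    return value.strip().lower() in {"true", "1", "yes"}


@dataclass(frozen=True)
class AnalysisConfig:
    inference_enable: bool
    inference_model: str
    sliding_window_size: int
    outlier_enable: bool
    reduction_enable: bool
    reduction_model: str


def load_analysis_config(text):
    """Build an AnalysisConfig and reject values outside the documented options."""
    env = parse_env(text)
    config = AnalysisConfig(
        inference_enable=as_bool(env["INFERENCE_ENABLE"]),
        inference_model=env["INFERENCE_MODEL"],
        sliding_window_size=int(env["SLIDING_WINDOW_SIZE"]),
        outlier_enable=as_bool(env["OUTLIER_ENABLE"]),
        reduction_enable=as_bool(env["REDUCTION_ENABLE"]),
        reduction_model=env["REDUCTION_MODEL"],
    )
    if config.inference_model not in INFERENCE_MODELS:
        raise ValueError(f"Unknown INFERENCE_MODEL: {config.inference_model}")
    if config.sliding_window_size not in WINDOW_SIZES:
        raise ValueError(f"Unsupported SLIDING_WINDOW_SIZE: {config.sliding_window_size}")
    if config.reduction_model not in REDUCTION_MODELS:
        raise ValueError(f"Unknown REDUCTION_MODEL: {config.reduction_model}")
    return config
```

Validating at load time surfaces a mistyped model name before any sensor data is processed.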

📌 Service Core

The Service Core is responsible for handling HTTP requests, storing data, and managing communication with MongoDB.

  • HTTP Config: Defines the port where the service will be accessible.
  • MongoDB Config: Specifies the database connection details.

📄 Modify edge/service_core/.env

CLIENT_ID=Edge_UB01

# 🌐 HTTP Configuration
HTTP_PORT=1010  # Defines the exposed port for service communication

# 🛢️ MongoDB Configuration
DB_URI=mongodb://user:password@intec-mongo-db:27017/edge?authSource=admin

📌 Sensor Node

The Sensor Node acts as a simulated IoT device that publishes sensor data to the MQTT broker.

  • Name: The identifier of the sensor.
  • Subject: The MHEALTH subject whose recording the sensor publishes. The sensor contains 10 subjects from the MHEALTH dataset, each representing a different individual's recorded signals, such as acceleration, body movements, and ECG.
  • MQTT Broker: The MQTT server address.
  • Topic: The MQTT topic where sensor data is published.
  • WindowSize: Defines how much sensor data is collected before sending.
  • Rate: The sampling rate of the sensor.
  • Time: Specifies the sensor runtime duration, determining how long the sensor will continuously publish data before stopping.

📄 Modify docker-compose.yml

  intec-sensor:
    environment:
      - Name=sensor01
      - Subject=subject1
      - Broker=intec-emqx-broker  # Uses EMQX broker inside Docker
      - Topic=sensor/data
      - WindowSize=25
      - Rate=50
      - Time=60
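
To show how WindowSize, Rate, and Time could interact, here is a minimal sketch of a windowed publisher. It rests on assumptions that the text above does not confirm: Rate is read as samples per second, Time as seconds, windows do not overlap, and the JSON payload shape is invented for illustration. The publish argument stands in for an MQTT client's publish call, and the sketch does not pace messages in real time.

```python
import itertools
import json
from typing import Callable, Iterable, Iterator, List


def batch_windows(samples: Iterable[list], window_size: int) -> Iterator[List[list]]:
    """Yield consecutive, non-overlapping windows; a trailing partial window is dropped."""
    window: List[list] = []
    for sample in samples:
        window.append(sample)
        if len(window) == window_size:
            yield window
            window = []


def run_sensor(
    samples: Iterable[list],
    publish: Callable[[str, str], None],
    *,
    name: str,
    topic: str,
    window_size: int,
    rate: int,
    duration: int,
) -> int:
    """Publish one message per full window and return the number of messages sent."""
    # Assumption: the sensor emits rate * duration samples in total.
    limited = itertools.islice(samples, rate * duration)
    sent = 0
    for window in batch_windows(limited, window_size):
        # Hypothetical payload shape; the real sensor's format may differ.
        publish(topic, json.dumps({"name": name, "data": window}))
        sent += 1
    return sent
```

Under these assumptions, the values in the snippet above (WindowSize=25, Rate=50, Time=60) correspond to 3000 samples sent as 120 messages.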

License

This project is licensed under the MIT License.
