Building a Python Project with Hugging Face Translation Model Using FastAPI and React JS Frontend: Performance Analysis and Estimation
Quick summary:
This article outlines the process of creating a client (using React JS, Vite, and TypeScript) and a server (utilizing Python FastAPI, the Transformers library, and the Helsinki-NLP translation model) to demonstrate and analyze the performance of the translation model in real-time applications.
About the model:
The Helsinki-NLP translation models are part of the Hugging Face model hub, created by the University of Helsinki’s Natural Language Processing (NLP) group. These models are based on the MarianMT framework, a highly efficient machine translation model designed for multiple languages.
These models, like opus-mt-en-fr (English to French) or opus-mt-fr-en (French to English), are trained on the OPUS corpus, a multilingual dataset containing parallel text data from various sources. They support translations between many languages, especially low-resource ones, and are pre-trained using a transformer architecture.
Key aspects:
- Wide language support: Can handle many language pairs, including low-resource languages.
- Performance: Optimized for translation tasks in real-time applications.
- Model availability: These models are available for free on the Hugging Face Model Hub, where users can easily integrate them into their projects using libraries like transformers.
These models are commonly used in academic research, multilingual NLP applications, and real-time translation systems.
For more details, you can explore the models on the Hugging Face platform: https://huggingface.co/Helsinki-NLP.
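As a quick taste before building the server, here is a minimal sketch of using one of these models directly with the transformers pipeline (model names follow the opus-mt-{src}-{tgt} pattern; the first run downloads the weights):

from transformers import pipeline

# Load the English-to-French model (downloaded on first use, then cached locally)
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
print(translator("Hello, world!")[0]["translation_text"])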
Let’s start with the FastAPI backend (server side):
mkdir huggingface_translation
cd huggingface_translation
Create a virtual environment:
python3 -m venv venv
source venv/bin/activate
Install the required packages:
pip install fastapi uvicorn transformers torch sentencepiece sacremoses psutil
Now, let’s create a simple Python script that uses a Hugging Face translation model with some additional performance metrics (memory and execution time):
from fastapi import FastAPI
from pydantic import BaseModel
from fastapi.concurrency import run_in_threadpool
import torch
import psutil
import os
import time
from transformers import pipeline

app = FastAPI()

def get_size(num_bytes, suffix="B"):
    # Convert a raw byte count into a human-readable string, e.g. "12.34MB"
    factor = 1024
    for unit in ["", "K", "M", "G", "T", "P"]:
        if num_bytes < factor:
            return f"{num_bytes:.2f}{unit}{suffix}"
        num_bytes /= factor

def translate(text, src_lang="en", tgt_lang="fr"):
    device = 0 if torch.cuda.is_available() else -1
    start_time = time.time()
    # Snapshot memory before translating: GPU stats if available, otherwise process RSS
    if torch.cuda.is_available():
        torch.cuda.reset_peak_memory_stats()
        start_mem = torch.cuda.memory_allocated()
    else:
        process = psutil.Process(os.getpid())
        start_mem = process.memory_info().rss
    translator = pipeline("translation", model=f"Helsinki-NLP/opus-mt-{src_lang}-{tgt_lang}", device=device)
    # max_length caps the translated output at 40 tokens
    translated = translator(text, max_length=40)[0]['translation_text']
    end_time = time.time()
    if torch.cuda.is_available():
        end_mem = torch.cuda.max_memory_allocated()
        mem_diff = end_mem - start_mem
    else:
        end_mem = process.memory_info().rss
        mem_diff = end_mem - start_mem
    computation_time = end_time - start_time
    return translated, get_size(mem_diff), computation_time

class TranslationRequest(BaseModel):
    text: str
    src_lang: str = "en"
    tgt_lang: str = "fr"

@app.post("/translate")
async def translate_text(req: TranslationRequest):
    # Run the blocking translate() in a worker thread so the event loop stays free
    translated_text, memory_used, comp_time = await run_in_threadpool(translate, req.text, req.src_lang, req.tgt_lang)
    response = {
        "original_text": req.text,
        "translated_text": translated_text,
        "memory_used": memory_used,
        "computation_time": f"{comp_time:.2f} seconds",
        "device": "GPU" if torch.cuda.is_available() else "CPU"
    }
    return response
This code demonstrates a FastAPI-based backend service that translates text using the Hugging Face transformers library. It uses the Helsinki-NLP translation models, allowing translation between different language pairs. Here’s a breakdown:
- Memory Tracking and Performance Monitoring: The translate function measures memory usage and execution time for the translation operation, supporting both GPU and CPU processing.
- FastAPI POST Route: The /translate endpoint accepts a POST request with text, src_lang, and tgt_lang. It calls the translation function and reports the memory used, the translation time, and whether the operation used GPU or CPU.
- Dependencies: torch for GPU computation, psutil for memory usage stats, and transformers for the translation pipeline.
- run_in_threadpool: Executes synchronous functions within an asynchronous context without blocking the event loop. It moves the CPU-bound translate function to a separate thread.
- Async Route: The route itself remains asynchronous, allowing FastAPI to continue handling other requests while the translation runs.
This API is built for performance tracking, returning data to clients about the translation, the computation time, and the device used (GPU/CPU).
To run the backend (assuming the script above is saved as huggingface_translation.py):
uvicorn huggingface_translation:app --reload
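Once the server is up, you can sanity-check the endpoint without the frontend, for example with curl (sample payload, adjust as needed):

curl -X POST http://127.0.0.1:8000/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!", "src_lang": "en", "tgt_lang": "fr"}'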
Let’s proceed with the frontend (TypeScript, client side):
npm create vite@latest my-react-app -- --template react-ts
cd my-react-app
npm install
Update App.tsx (in the src directory):
import { useState } from 'react';
import './App.css';

// Define the interface for the request data structure
interface TranslateRequest {
  text: string;
  src_lang: string;
  tgt_lang: string;
}

// Define an array of language options
const languageOptions = [
  { code: 'en', name: 'English' },
  { code: 'fr', name: 'French' },
  { code: 'es', name: 'Spanish' },
  { code: 'de', name: 'German' },
  { code: 'zh', name: 'Chinese' },
  { code: 'ru', name: 'Russian' },
  // Add more languages as needed
];

const App = () => {
  const [text, setText] = useState<string>('');
  const [srcLang, setSrcLang] = useState<string>('en'); // State for source language
  const [tgtLang, setTgtLang] = useState<string>('fr'); // State for target language
  const [translated, setTranslated] = useState<string>('');

  const handleSubmit = async (event: React.FormEvent) => {
    event.preventDefault();
    console.log("Form submitted with text:", text); // Log the input text

    // Create the request data based on the interface
    const requestData: TranslateRequest = {
      text,
      src_lang: srcLang, // Use the source language from state
      tgt_lang: tgtLang, // Use the target language from state
    };

    const response = await fetch('/api/translate', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(requestData), // Use requestData here
    });

    if (!response.ok) {
      console.error("Error in response:", response); // Log response if not ok
      return;
    }

    const data = await response.json();
    console.log("Translated text received:", data.translated_text);
    setTranslated(data.translated_text);
  };

  return (
    <div className="App">
      <form onSubmit={handleSubmit}>
        <input
          type="text"
          value={text}
          onChange={(e) => setText(e.target.value)}
          placeholder="Enter text to translate"
        />
        <br />
        {/* Dropdown for source language selection */}
        From: <select value={srcLang} onChange={(e) => setSrcLang(e.target.value)}>
          {languageOptions.map((lang) => (
            <option key={lang.code} value={lang.code}>
              {lang.name}
            </option>
          ))}
        </select>
        <br />
        {/* Dropdown for target language selection */}
        To: <select value={tgtLang} onChange={(e) => setTgtLang(e.target.value)}>
          {languageOptions.map((lang) => (
            <option key={lang.code} value={lang.code}>
              {lang.name}
            </option>
          ))}
        </select>
        <br />
        <button type="submit">Translate</button>
      </form>
      <p>{translated}</p>
    </div>
  );
};

export default App; // Exporting as default
This React app allows users to input text, select source and target languages from dropdowns, and submit the text for translation. It uses useState hooks to manage the input text, source language, target language, and translated text. When the form is submitted, a POST request is sent to the backend API (/api/translate), which processes the translation and returns the result. The app displays the translated text after receiving the response.
The code includes:
- TranslateRequest Interface: Defines the structure of the data sent to the API (text, source language, target language).
- Language Selection: Dropdown menus allow users to choose the source and target languages dynamically.
- State Management: The useState hook is used for handling text input, language selection, and translated text.
Now just add a proxy in vite.config.ts so requests to the /api path reach the backend correctly (the rewrite strips the /api prefix, so /api/translate maps to the FastAPI /translate endpoint):
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

// https://vitejs.dev/config/
export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      '/api': {
        target: 'http://127.0.0.1:8000',
        changeOrigin: true,
        rewrite: (path) => path.replace(/^\/api/, '')
      }
    }
  }
});
Run the app:
npm run dev
What you should see in the browser window (mind the port number shown in the terminal window):
Let’s check the performance (MacBook Pro M1 Max; CUDA is not available here, so the model runs on the CPU):
Use http://127.0.0.1:8000/docs#/default for Swagger UI (FastAPI serves Swagger by default, so no extra code is needed):
And the result is not too bad, under 2 seconds:
Once the translation model is loaded, subsequent translations might not require loading the full model into memory again. This means that only minimal memory is allocated during translation, which results in low memory usage for each translation task after the initial load.
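Strictly speaking, the code above constructs a new pipeline inside translate on every request; after the first call the model files come from the local Hugging Face cache on disk rather than the network, but the weights are still re-instantiated. To guarantee the model stays resident in memory, the pipeline could be cached per language pair. A minimal sketch (get_translator is a hypothetical helper, not part of the code above):

from functools import lru_cache
from transformers import pipeline
import torch

@lru_cache(maxsize=4)  # keep up to 4 language-pair pipelines alive in memory
def get_translator(src_lang: str, tgt_lang: str):
    device = 0 if torch.cuda.is_available() else -1
    return pipeline("translation",
                    model=f"Helsinki-NLP/opus-mt-{src_lang}-{tgt_lang}",
                    device=device)

# translate() would then call get_translator(src_lang, tgt_lang)
# instead of building a new pipeline on every request.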
Execution time, raw data:
Basic visualisation and analysis:
As you can see, execution time is close to a linear function of text length, which is expected from the Helsinki-NLP translation model: the transformer decoder generates output tokens sequentially, so longer texts take roughly proportionally longer to translate.
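If you want to quantify that trend from your own measurements, a quick least-squares fit will do; a sketch with placeholder numbers (replace lengths and times with the raw data above):

import numpy as np

# Placeholder measurements: input length (characters) and execution time (seconds)
lengths = np.array([20, 50, 100, 200, 400])
times = np.array([0.4, 0.7, 1.1, 1.9, 3.5])

# Fit time = slope * length + intercept
slope, intercept = np.polyfit(lengths, times, 1)
print(f"time ≈ {slope * 1000:.2f} ms per character + {intercept:.2f} s overhead")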
Summary:
In this article we created a client (React JS, Vite, TypeScript) and a server (Python FastAPI, Transformers library, Helsinki-NLP translation model) and analyzed the performance of this solution.