
Today I’ll walk through an n8n workflow that receives images from Telegram chats, automatically extracts the text from them and stores it in Airtable, and backs up the original images to AWS S3.
1. Core Node Analysis
The workflow contains four core nodes, connected so that data passes from one node to the next:
1. Telegram Trigger (Trigger Node)
- Function: Listen to the messages of the Telegram bot and serve as the “start switch” for the entire workflow.
- Key configuration:
  - updates: ["*"]: Listen for all types of message updates (text, images, files, etc.).
  - download: true + imageSize: "medium": If a picture message is received, automatically download the binary data of the medium-sized picture (for subsequent processing).
- Credentials: Associated with the Telegram Bot API credentials named “Telegram mybot” to ensure that the bot messages can be received.
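To make the imageSize setting more concrete: Telegram delivers each photo as a list of sizes, and the trigger has to pick one of them. The snippet below is a minimal sketch of that selection, not n8n's actual code. The SIZE_INDEX mapping and the pick_photo helper are assumptions made for illustration; only the file_id, width, and height fields come from Telegram's PhotoSize object.

```python
from typing import Dict, List

# Hypothetical mapping from the imageSize option to a position in the list of
# available sizes (smallest first). This is an assumption, not n8n's verified logic.
SIZE_INDEX = {"small": 0, "medium": 1, "large": 2, "extraLarge": 3}


def pick_photo(photo_sizes: List[Dict], image_size: str = "medium") -> Dict:
    """Return the entry for the requested size, clamped to the largest available."""
    if not photo_sizes:
        raise ValueError("message contains no photo")
    # Order by pixel area so the index is meaningful regardless of input order.
    ordered = sorted(photo_sizes, key=lambda p: p["width"] * p["height"])
    index = min(SIZE_INDEX[image_size], len(ordered) - 1)
    return ordered[index]


# Made-up sample shaped like the message.photo field of a Telegram update.
photo = [
    {"file_id": "a", "width": 90, "height": 67},
    {"file_id": "b", "width": 320, "height": 240},
    {"file_id": "c", "width": 800, "height": 600},
]
print(pick_photo(photo)["file_id"])  # prints "b"
```

If fewer sizes are available than the option asks for, the sketch falls back to the largest one rather than failing.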
2. AWS S3 (Storage Node)
- Function: Upload the pictures received from Telegram to AWS S3 cloud storage to back up the original pictures.
- Key configuration:
  - operation: "upload": Perform the “upload” operation.
  - bucketName: "textract-demodata": Specify the target S3 bucket for upload (textract-demodata).
  - fileName: {{$binary.data.fileName}}: Read the original file name from the image downloaded from Telegram and use it as the storage name in S3 (to avoid file name conflicts).
- Credentials: Use AWS credentials (ID 9) with write access to the S3 bucket.
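For readers who want to see what this upload amounts to outside n8n, here is a rough Python equivalent. It only assembles the arguments, so it runs without AWS access. The build_upload_args helper and the layout of the binary dictionary (including its content key) are hypothetical; the Bucket, Key, Body, and ContentType names are the ones boto3's put_object call accepts.

```python
from typing import Dict


def build_upload_args(bucket: str, file_name: str, data: bytes, mime_type: str) -> Dict:
    """Assemble keyword arguments in the shape boto3's S3 put_object accepts."""
    if not file_name:
        raise ValueError("file name is required to build the object key")
    return {"Bucket": bucket, "Key": file_name, "Body": data, "ContentType": mime_type}


# Hypothetical binary item, standing in for what {{$binary.data.fileName}} reads.
binary = {"data": {"fileName": "file_12.jpg", "mimeType": "image/jpeg", "content": b"\xff\xd8"}}
args = build_upload_args(
    "textract-demodata",
    binary["data"]["fileName"],
    binary["data"]["content"],
    binary["data"]["mimeType"],
)
print(args["Bucket"], args["Key"])  # prints "textract-demodata file_12.jpg"
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("s3").put_object(**args)
```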
3. AWS Textract (Text Extraction Node)
- Function: Call AWS Textract’s OCR (Optical Character Recognition) service to extract text content from images.
- Key configuration:
  - No additional parameters. By default the node uses the DetectDocumentText interface to extract all text in the picture, including structured information such as lines and words.
- Credentials: Share the same AWS credentials with AWS S3 to ensure that the Textract service can be invoked.
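As an illustration of what comes back from text detection, the sketch below pulls the line-level text out of a response shaped like the one DetectDocumentText returns (a Blocks list whose entries carry BlockType, Text, and Confidence). The extract_lines helper and the sample data are made up for this example; the sample is hand-written, not real Textract output.

```python
from typing import Dict, List


def extract_lines(response: Dict, min_confidence: float = 0.0) -> List[str]:
    """Collect the text of LINE blocks from a DetectDocumentText-style response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


# Hand-written sample in the documented response shape.
sample = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "CORNER CAFE", "Confidence": 99.1},
        {"BlockType": "WORD", "Text": "CORNER", "Confidence": 99.3},
        {"BlockType": "WORD", "Text": "CAFE", "Confidence": 98.9},
        {"BlockType": "LINE", "Text": "TOTAL 12.50", "Confidence": 97.4},
    ]
}
print(extract_lines(sample))  # prints ['CORNER CAFE', 'TOTAL 12.50']
```

Reading LINE blocks rather than WORD blocks keeps each printed row of a receipt together, which tends to be easier to store and search.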
4. Airtable (Data Storage Node)
- Function: Store the text data extracted by Textract into the Airtable table for structured management.
- Key configuration:
  - operation: "append": Append new records to the table (without overwriting historical data).
  - table: "receipts": The target table is named “receipts”.
  - application: "qwertz": Associate the “qwertz” application (i.e., base) of Airtable.
- Credentials: Use Airtable API credentials (ID 6) with table write permissions.
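The append operation corresponds to creating a record through Airtable's REST API. The sketch below builds such a request without sending it. The build_append_request helper and the "Text" field name are assumptions for illustration, since the source does not say which columns the receipts table has; "qwertz" is the placeholder application ID from the configuration above.

```python
import json
from typing import Dict, List


def build_append_request(base_id: str, table: str, text_lines: List[str]) -> Dict:
    """Build the URL and JSON body for creating one record via Airtable's REST API."""
    return {
        "url": f"https://api.airtable.com/v0/{base_id}/{table}",
        # "Text" is a hypothetical column name; use the real column of your table.
        "body": {"records": [{"fields": {"Text": "\n".join(text_lines)}}]},
    }


request = build_append_request("qwertz", "receipts", ["CORNER CAFE", "TOTAL 12.50"])
print(request["url"])  # prints "https://api.airtable.com/v0/qwertz/receipts"
print(json.dumps(request["body"]))
```

Sending the request would additionally need an Authorization header carrying the API credentials.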
2. Data Flow and Complete Process
- Trigger condition: The user sends a picture message (such as a receipt photo or screenshot) to the Telegram bot.
- Data split: After the Telegram Trigger node downloads the picture, it sends the picture data to both AWS S3 and AWS Textract (parallel branches).
- Image backup: The AWS S3 node uploads the image to the textract-demodata bucket so the original file remains traceable.
- Text extraction: AWS Textract performs OCR on the image to extract text content (such as the amount, date, and merchant name on a receipt).
- Data storage: The extracted text is appended to the “receipts” table through the Airtable node to form a structured text record (for subsequent query, statistics, or automated reconciliation).
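The flow above can be sketched as a small graph. The connections dictionary follows the shape n8n uses for connections in exported workflow JSON, with the wiring taken from the description in this section rather than from the template file itself. The downstream and execution_paths helpers are hypothetical names added for illustration.

```python
from typing import Dict, List

# Wiring as described in the text: the trigger feeds two branches, and only
# the Textract branch continues on to Airtable.
connections: Dict = {
    "Telegram Trigger": {"main": [[
        {"node": "AWS S3", "type": "main", "index": 0},
        {"node": "AWS Textract", "type": "main", "index": 0},
    ]]},
    "AWS Textract": {"main": [[{"node": "Airtable", "type": "main", "index": 0}]]},
}


def downstream(node: str) -> List[str]:
    """List the nodes that directly receive the output of `node`."""
    outputs = connections.get(node, {}).get("main", [])
    return [target["node"] for branch in outputs for target in branch]


def execution_paths(start: str) -> List[List[str]]:
    """Enumerate every path from `start` to a node with no outgoing connection."""
    targets = downstream(start)
    if not targets:
        return [[start]]
    return [[start] + path for target in targets for path in execution_paths(target)]


for path in execution_paths("Telegram Trigger"):
    print(" -> ".join(path))
# prints:
# Telegram Trigger -> AWS S3
# Telegram Trigger -> AWS Textract -> Airtable
```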
3. Use Cases and Application Scenarios
Judging by the node configuration (especially the Airtable table name “receipts”), the most typical use of this workflow is automating the digital management of receipts and invoices. Specific scenarios include:
- Personal/Business Bookkeeping: Users take photos of receipts and send them to Telegram. Key information such as amount and date is automatically extracted and stored in Airtable, replacing manual entry.
- Simplified Reimbursement Process: Employees upload reimbursement invoices, the system automatically extracts the information and aggregates it in Airtable, and the finance department can view the structured data directly.
- Document Archiving: Back up the original pictures to AWS S3 and store the text information in Airtable to achieve a double backup of “picture + text” for easy retrieval (such as filtering receipts by date/amount).
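Filtering by date or amount presupposes that those values have been separated out of the raw OCR text. One simple way to do that is with regular expressions, sketched below. The patterns, the parse_receipt helper, and the sample receipt are all assumptions for illustration: they expect a line containing the word TOTAL and an ISO-formatted date, which real receipts will not always have.

```python
import re
from typing import Dict, List, Optional

# Hypothetical patterns: an amount with two decimals after the word TOTAL,
# and a date written as YYYY-MM-DD.
AMOUNT = re.compile(r"\bTOTAL\b\D*(\d+[.,]\d{2})", re.IGNORECASE)
DATE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


def parse_receipt(lines: List[str]) -> Dict[str, Optional[str]]:
    """Pull the first total amount and ISO date found in OCR text lines."""
    text = "\n".join(lines)
    amount = AMOUNT.search(text)
    date = DATE.search(text)
    return {
        # Normalize a decimal comma to a decimal point.
        "amount": amount.group(1).replace(",", ".") if amount else None,
        "date": date.group(1) if date else None,
    }


print(parse_receipt(["CORNER CAFE", "2024-03-05", "TOTAL EUR 12,50"]))
# prints {'amount': '12.50', 'date': '2024-03-05'}
```

Fields that cannot be found come back as None, so a record can still be stored and corrected by hand later.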
Template download: