Published 2026-01-06

🧠🎤 FluentMate - Your Smart Fluency Friend & 24x7 Mentor 💬🤖 🎙️ Speak, Reflect, Improve


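This is a submission for the AssemblyAI Voice Agents Challenge.

What I Built

Speak Reflect Improve is an always-on English fluency mentor. It is a real-time voice agent that analyzes your spoken English and gives detailed feedback on pronunciation, grammar, fluency, vocabulary, and more.

[Image: project homepage]

Whether you are preparing for a job interview, a public talk, or just want more confidence, Speak Reflect Improve helps you reflect on how you speak and express yourself a little better every day.

It fits these two prompt categories of the challenge:

  • Business Automation: it offers scalable, round-the-clock, personalized fluency coaching that supports career development, onboarding, and upskilling.

  • 🎓 Domain Expert: it understands the nuances of English communication and acts as an experienced mentor on fluency, idioms, and CEFR level standards.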


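🎯 Why I Built It

I'm an introvert. In real life, I don't talk to people much. But life doesn't pause for shyness. Interviews, group discussions, pitches: all of them demand strong communication skills.

And I had no one to practice with. No teacher on call. No fluent partner to correct my mistakes at 2 a.m.

So I built Speak Reflect Improve, a smart, judgment-free, 24x7 speaking companion that listens, scores, corrects, and encourages you. It has become my silent mentor, and I hope it can become yours too.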


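💡 Features

  • 🎙️ Real-Time Speaking Assessment

    • Record yourself on any topic, or pick a random prompt.
    • Choose a short (2-minute) or long (7-minute) assessment.
  • 📈 AI-Powered Fluency Feedback

    • Covers pronunciation, vocabulary, grammar, fluency, pauses, filler words, coherence, and idiom usage.
    • Scored out of 10 and mapped to a CEFR level (A1 to C2).
  • 🎯 Tailored Coaching from a Fluency Expert

    • Personalized suggestions to take you to the next level.
    • Actionable tips, goals, and motivational feedback.
  • 🧠 Built for Learning

    • Designed to improve real-world communication for students, developers, job seekers, and global speakers.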

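🌐 Demo

  • 🔗 Live site:

Click here to see my site 👇 (please give it a moment to load 🥹🥹, or watch the demo video first 😅)

Speak Reflect Improve

  • 🎥 Video demo:

Check out this video, where I walk through a recording of my project 🤔🤔: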


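💻 GitHub Repository

Take a look at my GitHub repository below. Maybe you'd like to read the code, or dive right in and clone, fork, or contribute 😁!

🎙️ Speak Reflect Improve

Speak naturally and let our AI analyze every aspect of your English.

Here is a screenshot of my project:

[Screenshot: screencapture-127-0-0-1-5000-2025-07-28-05_40_34]

See the live version here: Speak Reflect Improve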


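Features

🎤 Speech Analysis

  • Pronunciation Assessment: detailed analysis of clarity, accent, and intonation
  • Vocabulary Assessment: evaluation of range, sophistication, and appropriateness
  • Grammar Analysis: assessment of accuracy, sentence structure, and complexity
  • Fluency Assessment: speech flow, pause patterns, and filler-word detection
  • Coherence and Organization: analysis of logical flow and how ideas connect
  • Idioms and Phrases: evaluation of natural expression and idiom usage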
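
To make the fluency bullet concrete, here is a minimal sketch of filler-word detection and speaking rate computed from a transcript. It is illustrative only: the project hands this analysis to the LLM, and the `FILLERS` set, the `fluency_stats` name, and the `duration_seconds` argument are assumptions, not code from the repository.

```python
import re
from collections import Counter

# Hypothetical filler list; a real coach would tune this per learner.
FILLERS = {"um", "uh", "er", "like", "basically", "actually"}

def fluency_stats(transcript: str, duration_seconds: float) -> dict:
    """Count filler words and estimate words per minute from a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = Counter(w for w in words if w in FILLERS)
    wpm = len(words) / (duration_seconds / 60) if duration_seconds > 0 else 0.0
    return {
        "word_count": len(words),
        "filler_counts": dict(fillers),
        "filler_ratio": sum(fillers.values()) / len(words) if words else 0.0,
        "words_per_minute": round(wpm, 1),
    }
```

A plain word match is naive: "like" in "I like pizza" is counted as a filler, which is one reason to leave the final judgment to the language model. Filler words also only reach the transcript if the speech-to-text service keeps them (AssemblyAI documents a `disfluencies` option for this).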

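📊 Proficiency Grading

  • CEFR Level Assessment: classification into A1, A2, B1, B2, C1, or C2 (Common European Framework of Reference for Languages)
  • Overall Score: 1-10 scale with a detailed explanation
  • Actionable Feedback: specific suggestions for improvement

🤖 AI Coach

  • Real-Time Feedback: instant corrections and suggestions
  • Personalized Suggestions: customized improvement strategies
  • Cultural Context: natural expressions and…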

⚙️ Tech Stack

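| Technology | Purpose |
| --- | --- |
| Flask | Python backend for routing and logic |
| AssemblyAI | Real-time streaming speech-to-text |
| Groq API | Fast LLM-based English analysis |
| JavaScript | UI interactions and audio logic |
| HTML/CSS | Responsive frontend with a dark theme |
| Web API | Microphone access via MediaRecorder |
| HTTPS | Required for audio recording in the browser |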

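🧠 Technical Implementation & AssemblyAI Integration

The core of the app is AssemblyAI's Universal-Streaming API for real-time transcription, paired with Groq's Llama 3 model for analysis.

The snippets below show how AssemblyAI is integrated into the English fluency coach:

🎯 1. AssemblyAI Initialization and Configuration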

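# utils/voice_manager.py - AssemblyAI Setup
from typing import Dict

# ASSEMBLYAI_AVAILABLE is referenced below but not defined in this excerpt;
# an import guard like this one is the assumed definition.
try:
    import assemblyai as aai
    ASSEMBLYAI_AVAILABLE = True
except ImportError:
    ASSEMBLYAI_AVAILABLE = False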

class VoiceManager:
    def __init__(self, api_keys: Dict[str, str]):
        self.api_keys = api_keys
        self.assemblyai_available = False
        self._init_assemblyai()

    def _init_assemblyai(self):
        if ASSEMBLYAI_AVAILABLE and self.api_keys.get('ASSEMBLYAI_API_KEY'):
            try:
                aai.settings.api_key = self.api_keys['ASSEMBLYAI_API_KEY']

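                # Built only to confirm the SDK accepts these options;
                # transcribe_audio() creates its own config per request.
                test_config = aai.TranscriptionConfig(
                    language_detection=True,   # auto-detect the spoken language
                    punctuate=True,            # add punctuation
                    format_text=True,          # apply casing and number formatting
                    speaker_labels=False,      # single speaker, no diarization
                    auto_highlights=False      # key-phrase extraction not needed
                )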

                self.assemblyai_available = True
                print("✅ AssemblyAI initialized successfully")

            except Exception as e:
                print(f"❌ AssemblyAI initialization failed: {e}")
                self.assemblyai_available = False

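This code initializes the AssemblyAI SDK with the configuration used for fluency assessment. It enables automatic language detection, turns on punctuation and text formatting so the transcript reads cleanly, and disables speaker labels because only one person is speaking. Initialization failures are caught and logged, and the `assemblyai_available` flag records whether the SDK path can be used later.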
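
`VoiceManager` receives its keys as a plain dictionary. A minimal sketch of how that dictionary might be built from environment variables follows; `load_api_keys` and the `GROQ_API_KEY` name are assumptions for illustration, and only `ASSEMBLYAI_API_KEY` appears in the snippets of this post.

```python
import os
from typing import Dict, Mapping, Optional

# ASSEMBLYAI_API_KEY comes from the post; GROQ_API_KEY is an assumed name.
EXPECTED_KEYS = ("ASSEMBLYAI_API_KEY", "GROQ_API_KEY")

def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect non-empty API keys from the environment (or a supplied mapping)."""
    source = os.environ if env is None else env
    keys: Dict[str, str] = {}
    for name in EXPECTED_KEYS:
        value = source.get(name, "").strip()
        if value:
            keys[name] = value
    return keys
```

With this helper, construction would look like `VoiceManager(load_api_keys())`. Missing or blank keys are simply left out, which matches the `self.api_keys.get(...)` checks in the class.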

🎤 2. Core Audio Transcription with Dual Mode

# utils/voice_manager.py - Main Transcription Function
# (a VoiceManager method; needs `import os` at module level)
def transcribe_audio(self, audio_file_path: str) -> str:
    if not os.path.exists(audio_file_path):
        return "❌ Audio file not found"

    if self.assemblyai_available:
        try:
            print("🔄 Trying AssemblyAI SDK...")

            config = aai.TranscriptionConfig(
                language_detection=True,    
                punctuate=True,            
                format_text=True,          
                speaker_labels=False,      
                auto_highlights=False      
            )

            transcriber = aai.Transcriber(config=config)
            transcript = transcriber.transcribe(audio_file_path)

            if transcript.status == "completed":
                print("✅ AssemblyAI SDK transcription successful")
                return self._clean_transcription(transcript.text)
            elif transcript.status == "error":
                print(f"❌ AssemblyAI SDK error: {transcript.error}")
                return f"❌ Transcription error: {transcript.error}"

        except Exception as e:
            # Log the failure and fall through to the direct API fallback below
            print(f"❌ AssemblyAI SDK error: {e}")

    if self.api_keys.get('ASSEMBLYAI_API_KEY'):
        try:
            print("🔄 Trying AssemblyAI Direct API...")
            result = self._transcribe_with_api(audio_file_path)
            if result and not result.startswith("❌"):
                print("✅ AssemblyAI API transcription successful")
                return self._clean_transcription(result)
        except Exception as e:
            print(f"❌ AssemblyAI API error: {e}")

    return "❌ Transcription failed. Please check API configuration."

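This function takes a two-stage approach: it tries the AssemblyAI SDK first and, if the SDK is unavailable or raises an exception, falls back to calling the REST API directly. Both paths use the same options (language detection, punctuation, text formatting). Note that a transcript that comes back from the SDK with an error status is returned as an error right away instead of being retried through the fallback.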
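
The same try-in-order idea can be written as a small generic helper. This is a sketch under assumptions, not the project's code: `first_successful`, `broken_sdk`, and `working_api` are hypothetical names, and the stand-in backends exist only so the example runs without network access.

```python
from typing import Callable, List, Tuple

Backend = Callable[[str], str]

def first_successful(backends: List[Tuple[str, Backend]], audio_path: str) -> str:
    """Try each backend in order and return the first non-error transcript."""
    for name, backend in backends:
        try:
            result = backend(audio_path)
        except Exception as exc:  # a failing backend should not stop the chain
            print(f"{name} failed: {exc}")
            continue
        # The post's convention: error results are strings starting with "❌"
        if result and not result.startswith("❌"):
            return result
    return "❌ Transcription failed. Please check API configuration."

# Stand-in backends for demonstration only.
def broken_sdk(path: str) -> str:
    raise RuntimeError("SDK unavailable")

def working_api(path: str) -> str:
    return "hello world"
```

Calling `first_successful([("sdk", broken_sdk), ("api", working_api)], "clip.wav")` logs the SDK failure and returns the API result. Adding a third backend becomes a one-line change to the list.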

🔧 3. Direct API Implementation with Enhanced Features

# utils/voice_manager.py - Direct API Implementation
# (a VoiceManager method; needs `import requests` at module level)
def _transcribe_with_api(self, audio_file_path: str) -> str:
    try:
        headers = {'authorization': self.api_keys['ASSEMBLYAI_API_KEY']}

        print("📤 Uploading audio file...")
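        with open(audio_file_path, 'rb') as f:
            # AssemblyAI's documented upload example sends the raw file bytes
            # as the request body (data=f), not a multipart form (files=...).
            response = requests.post(
                'https://api.assemblyai.com/v2/upload',
                headers=headers,
                data=f,
                timeout=60
            )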

        if response.status_code != 200:
            return f"❌ Upload failed: {response.status_code} - {response.text}"

        upload_url = response.json()['upload_url']
        print(f"✅ File uploaded: {upload_url}")

        print("🔄 Requesting transcription...")
        data = {
            'audio_url': upload_url,
            'language_detection': True,     
            'punctuate': True,             
            'format_text': True,           
            'speaker_labels': False,       
            'auto_highlights': False       
        }

        response = requests.post(
            'https://api.assemblyai.com/v2/transcript',
            headers=headers,
            json=data,
            timeout=30
        )

        if response.status_code != 200:
            return f"❌ Transcription request failed: {response.status_code}"

        transcript_id = response.json()['id']
        print(f"🔄 Transcription ID: {transcript_id}")

        print("⏳ Waiting for transcription to complete...")
        import time  # hoisted out of the loop; ideally lives at module top
        max_attempts = 60  # 60 polls x 2 s sleep = about 2 minutes
        attempt = 0

        while attempt < max_attempts:
            response = requests.get(
                f'https://api.assemblyai.com/v2/transcript/{transcript_id}',
                headers=headers,
                timeout=30
            )

            if response.status_code != 200:
                return f"❌ Status check failed: {response.status_code}"

            result = response.json()
            status = result['status']

            if status == 'completed':
                print("✅ Transcription completed")
                return result['text'] or "❌ No text in transcription result"
            elif status == 'error':
                error_msg = result.get('error', 'Unknown error')
                return f"❌ Transcription error: {error_msg}"
            elif status in ['queued', 'processing']:
                print(f"⏳ Status: {status} (attempt {attempt + 1}/{max_attempts})")
                time.sleep(2)  # 2-second polling interval
                attempt += 1
            else:
                return f"❌ Unknown status: {status}"

        return "❌ Transcription timeout - took too long to process"

    except requests.exceptions.Timeout:
        return "❌ Request timeout - please try again"
    except Exception as e:
        return f"❌ Unexpected error: {str(e)}"

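This function calls the AssemblyAI REST API directly and serves as the fallback path. It handles the full transcription workflow: uploading the file, requesting a transcript with the same options as the SDK path, and polling for the result. Network timeouts, non-200 responses, and error statuses are each turned into a readable error string. Polling sleeps 2 seconds between attempts and gives up after 60 attempts, which is roughly 2 minutes plus the time spent on the requests themselves.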
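
The polling step can also be expressed against a wall-clock deadline instead of an attempt counter. The helper below is a sketch, not the project's code: `poll_until_done` and its parameters are hypothetical, and `fetch_status` stands in for the HTTP GET so the example runs offline. The `sleep` and `clock` arguments are injectable purely to make the function easy to test.

```python
import time
from typing import Callable, Dict

def poll_until_done(fetch_status: Callable[[], Dict],
                    interval: float = 2.0,
                    timeout: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> Dict:
    """Call fetch_status until it reports a terminal status or the deadline passes."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "error"):
            return result
        # Stop if another sleep would carry us past the deadline
        if clock() + interval > deadline:
            return {"status": "timeout"}
        sleep(interval)
```

Because the deadline is measured with a monotonic clock, slow status requests count toward the 2-minute budget, which an attempt counter does not capture.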

🧹 4. Advanced Text Processing and Validation

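# VoiceManager helpers (need `import os` and `import re` at module level)
def _clean_transcription(self, text: str) -> str:
    if not text:
        return "❌ Empty transcription result"

    # Trim the ends and collapse runs of whitespace
    text = text.strip()
    text = re.sub(r'\s+', ' ', text)

    # Capitalize a lowercase letter that follows ., ! or ?
    text = re.sub(r'([.!?])\s*([a-z])',
                  lambda m: m.group(1) + ' ' + m.group(2).upper(), text)

    # Capitalize the first character of the transcript
    if text and not text[0].isupper():
        text = text[0].upper() + text[1:]

    # Make sure the transcript ends with sentence punctuation
    if text and text[-1] not in '.!?':
        text += '.'

    return text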

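# `any` is the builtin function, not a type; typing.Any is what was meant
from typing import Any, Dict

def validate_audio_file(self, file_path: str) -> Dict[str, Any]: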
    if not os.path.exists(file_path):
        return {
            'valid': False,
            'error': 'File does not exist',
            'file_size': 0
        }

    file_size = os.path.getsize(file_path)
    max_size = 100 * 1024 * 1024  # 100MB AssemblyAI limit

    if file_size > max_size:
        return {
            'valid': False,
            'error': f'File too large: {file_size / (1024*1024):.1f}MB (max 100MB)',
            'file_size': file_size
        }

    if file_size < 1000:  # Minimum viable audio size
        return {
            'valid': False,
            'error': 'File too small - may be empty or corrupted',
            'file_size': file_size
        }

    return {
        'valid': True,
        'error': None,
        'file_size': file_size,
        'file_size_mb': file_size / (1024 * 1024)
    }

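This code handles post-transcription text cleanup and audio file validation. The cleanup function normalizes whitespace, fixes capitalization at the start of the transcript and after sentence-ending punctuation, and makes sure the text ends with punctuation, so the fluency analysis sees consistent input. The validation function checks that the file exists, that it stays under the 100 MB cap the code sets for AssemblyAI uploads, and that it is at least 1,000 bytes, which filters out empty or corrupted recordings.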
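
To see what the cleanup steps do to real input, here is a standalone copy of the same logic as a plain function (the name `clean_transcription` without the underscore and without `self` is my change so it can run outside the class).

```python
import re

def clean_transcription(text: str) -> str:
    """Standalone copy of the cleanup steps: whitespace, casing, final period."""
    if not text:
        return "❌ Empty transcription result"
    text = re.sub(r"\s+", " ", text.strip())
    # Capitalize the first lowercase letter after ., ! or ?
    text = re.sub(r"([.!?])\s*([a-z])",
                  lambda m: m.group(1) + " " + m.group(2).upper(), text)
    if text and not text[0].isupper():
        text = text[0].upper() + text[1:]
    if text and text[-1] not in ".!?":
        text += "."
    return text

print(clean_transcription("  hello   world. this is a test  "))
# Hello world. This is a test.
print(clean_transcription("visit example.com today"))
# Visit example. Com today.
```

The second call shows a limitation worth knowing: because the pattern allows zero spaces after the period, dotted tokens such as domain names are split and capitalized. Requiring at least one space (`\s+` instead of `\s*`) would avoid that, at the cost of not repairing sentences that are missing the space.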

🌐 5. Flask Integration and Speech Analysis Endpoint

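# Needs: import os; from datetime import datetime;
# from flask import request, jsonify, session (plus the Flask `app` object)
@app.route('/analyze_speech', methods=['POST'])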
def analyze_speech():
    try:
        audio_file = request.files.get('audio')
        assessment_type = request.form.get('type', 'general')

        if not audio_file:
            return jsonify({'error': 'No audio file provided'}), 400

        session_id = session.get('session_id', datetime.now().strftime("%Y%m%d_%H%M%S"))
        session['session_id'] = session_id

        temp_path = f"temp/audio_{session_id}_{assessment_type}.wav"
        os.makedirs('temp', exist_ok=True)
        audio_file.save(temp_path)

        print(f"🔄 Starting speech analysis of {temp_path}")

        transcription = voice_manager.transcribe_audio(temp_path)

        if transcription.startswith("❌"):
            # Remove the temp file on the failure path too
            if os.path.exists(temp_path):
                os.remove(temp_path)
            return jsonify({'error': transcription}), 500

        analysis = english_analyzer.analyze_speech_proficiency(
            transcription=transcription,
            audio_file_path=temp_path,
            assessment_type=assessment_type
        )

        if os.path.exists(temp_path):
            os.remove(temp_path)
        session['last_analysis'] = analysis
        session['last_transcription'] = transcription

        print(f"✅ Speech analysis completed: Grade {analysis.get('overall_grade', 'N/A')}")

        return jsonify({
            'success': True,
            'transcription': transcription,
            'analysis': analysis
        })

    except Exception as e:
        print(f"❌ Speech analysis error: {e}")
        return jsonify({'error': f'Speech analysis failed: {str(e)}'}), 500

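This Flask endpoint coordinates the whole speech analysis workflow. It accepts the uploaded audio, saves it to a session-based temp file, sends it to AssemblyAI for transcription, runs the English proficiency analysis, and deletes the temp file afterwards. Any exception is caught and returned as a 500 response, the session keeps the latest transcription and analysis for the current user, and the JSON response contains both the transcript and the proficiency assessment.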
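
One way to make temp-file cleanup unconditional is a context manager around the saved upload. This is a sketch of an alternative, not what the endpoint does today; `temporary_audio` is a hypothetical helper, and it uses `tempfile.mkstemp` so two requests that share a timestamp cannot collide on the same file name.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator

@contextmanager
def temporary_audio(data: bytes, suffix: str = ".wav") -> Iterator[str]:
    """Write bytes to a uniquely named temp file and always delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Runs on success, on early return, and when the body raises
        if os.path.exists(path):
            os.remove(path)
```

Inside the handler this would be used as `with temporary_audio(audio_file.read()) as temp_path:` with the transcription and analysis calls in the body, assuming the upload fits comfortably in memory.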

🎓 6. AI-Powered English Proficiency Analysis

def analyze_speech_proficiency(self, transcription: str, audio_file_path: str = None, assessment_type: str = 'general') -> Dict:

    try:
        audio_analysis = self._analyze_audio_characteristics(audio_file_path) if audio_file_path else {}

        text_analysis = self._analyze_text_proficiency(transcription, assessment_type)

        combined_analysis = self._combine_analyses(text_analysis, audio_analysis, transcription)

        return combined_analysis

    except Exception as e:
        print(f"❌ Speech proficiency analysis error: {e}")
        raise

def _create_proficiency_prompt(self, transcription: str, assessment_type: str) -> str:

    return f"""
You are a world-renowned English language proficiency expert and certified TESOL instructor with 20+ years of experience. You are known for providing detailed, professional analysis like a premium English tutor.

ASSESSMENT TYPE: {assessment_type}
TRANSCRIPTION TO ANALYZE: "{transcription}"

Provide a comprehensive professional English assessment following this EXACT format:

## PRONUNCIATION ANALYSIS
[Detailed analysis of pronunciation quality, clarity, accent, stress patterns, intonation, and specific sounds. Be specific about what sounds good and what needs improvement.]

## VOCABULARY ASSESSMENT
[Analyze vocabulary range, sophistication, word choice appropriateness, and lexical diversity.]
- Advanced words used: [list specific advanced words they used]
- Vocabulary level: [beginner/intermediate/advanced with explanation]
- Suggested word upgrades: [specific examples like "good → excellent"]

## GRAMMAR EVALUATION
[Assess grammar accuracy, sentence structure complexity, tense usage, and error patterns.]
- Grammar strengths: [specific examples of correct usage]
- Areas for improvement: [specific grammar points to work on]
- Sentence complexity: [analysis of their sentence structures]

## FLUENCY ANALYSIS
[Evaluate speech flow, hesitations, filler words, pace, natural rhythm, and speaking rate]
- Filler words detected: [count and list them]
- Speaking rate assessment: [words per minute if calculable]
- Flow quality: [detailed assessment]

## COHERENCE & ORGANIZATION
[Assess logical flow, idea connection, topic development, clarity of expression]

## CEFR LEVEL ASSESSMENT
CEFR Level: [A1, A2, B1, B2, C1, or C2]
Level Description: [Beginner/Elementary/Intermediate/Upper-Intermediate/Advanced/Proficient]

## DETAILED PROFESSIONAL FEEDBACK
[Provide encouraging, detailed feedback like a professional English tutor. Be specific about their current abilities and growth potential.]

## PRIORITY FOCUS AREAS
[List the top 2-3 areas they should focus on immediately for maximum improvement]

Be detailed, professional, encouraging, and specific. Provide the kind of analysis a student would get from a premium English tutor.
"""

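This code implements the AI-driven proficiency analysis. It combines analysis of the transcript with characteristics of the audio and produces a detailed assessment of pronunciation, vocabulary, grammar, fluency, and coherence. The prompt asks the model for tutor-style feedback in a fixed section layout, including a CEFR level, specific improvement suggestions, and the top two or three focus areas, so learners get feedback they can act on.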
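
Because the prompt demands an exact `## SECTION` layout, the reply can be split back into fields with a few lines of parsing. The sketch below is an assumption about how that step could look; `parse_sections` and `extract_cefr_level` are hypothetical names, and the project's own parsing code is not shown in this post.

```python
import re
from typing import Dict, Optional

def parse_sections(response: str) -> Dict[str, str]:
    """Split a '## HEADING' formatted reply into {heading: body}."""
    sections: Dict[str, str] = {}
    current: Optional[str] = None
    for line in response.splitlines():
        heading = re.match(r"^##\s+(.+?)\s*$", line)
        if heading:
            current = heading.group(1).upper()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}

def extract_cefr_level(sections: Dict[str, str]) -> Optional[str]:
    """Pull the A1..C2 label out of the CEFR section, if one is present."""
    body = sections.get("CEFR LEVEL ASSESSMENT", "")
    match = re.search(r"CEFR Level:\s*\**\s*([ABC][12])\b", body)
    return match.group(1) if match else None
```

Returning `None` when the level is missing matters in practice: language models do not always follow a requested format, so the caller should be ready to fall back to a default or retry.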

🎯 7. Personalized Fluency Coaching System

def provide_coaching(self, user_input: str, practice_type: str, topic: str, user_level: str) -> Dict:
    try:
        coaching_prompt = self._create_coaching_prompt(user_input, practice_type, topic, user_level)

        coaching_response = self._get_ai_response(coaching_prompt)
        parsed_coaching = self._parse_coaching_response(coaching_response)

        return parsed_coaching

    except Exception as e:
        print(f"❌ Coaching error: {e}")
        return self._generate_fallback_coaching(user_input, practice_type)

def _create_coaching_prompt(self, user_input: str, practice_type: str, topic: str, user_level: str) -> str:

    return f"""
You are an expert English fluency coach with 15+ years of experience helping students improve their speaking skills. You specialize in providing constructive, encouraging, and actionable feedback.

PRACTICE TYPE: {practice_type}
TOPIC: {topic}
USER LEVEL: {user_level}
USER INPUT: "{user_input}"

Provide comprehensive coaching feedback following this EXACT format:

## IMMEDIATE FEEDBACK
[Provide immediate positive reinforcement and acknowledgment of their effort]

## CORRECTIONS
[List specific corrections needed with explanations]
- Original: [what they said]
- Corrected: [how it should be said]
- Explanation: [why this correction is needed]

## PRONUNCIATION NOTES
[Specific pronunciation feedback and tips]

## VOCABULARY ENHANCEMENT
[Suggest better word choices or more advanced vocabulary]
- Instead of: [basic word/phrase]
- Try using: [advanced alternative]
- Example: [sentence using the advanced word]

## GRAMMAR IMPROVEMENTS
[Point out grammar issues and provide corrections]

## FLUENCY TIPS
[Specific tips to improve natural flow and reduce hesitations]

## CULTURAL CONTEXT
[Explain any cultural nuances or more natural expressions]

## PRACTICE SUGGESTION
[Specific practice exercise based on their performance]

## ENCOURAGEMENT
[Motivational message highlighting their progress and strengths]

Provide specific, actionable feedback that helps them improve their English fluency naturally and confidently.
"""

def generate_practice_prompt(self, practice_type: str, topic: str, level: str) -> Dict:
    """
    Generate practice prompts for different types of exercises
    """
    prompts = {
        'conversation': self._get_conversation_prompts(topic, level),
        'pronunciation': self._get_pronunciation_prompts(level),
        'vocabulary': self._get_vocabulary_prompts(topic, level),
        'storytelling': self._get_storytelling_prompts(topic, level),
        'debate': self._get_debate_prompts(topic, level),
        'presentation': self._get_presentation_prompts(topic, level),
        'song_analysis': self._get_song_prompts(level)
    }

    return prompts.get(practice_type, prompts['conversation'])

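This code implements the personalized coaching. Given the user's input, practice type, topic, and level, it asks the model for specific, actionable feedback: corrections, pronunciation notes, vocabulary upgrades, grammar fixes, and cultural context. If the model call fails, it returns fallback coaching instead of an error. It also generates practice prompts for seven exercise types (conversation, pronunciation, vocabulary, storytelling, debate, presentation, song analysis), defaulting to conversation when the type is unknown, and keeps the tone encouraging and constructive.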
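
`generate_practice_prompt` builds all seven prompt sets on every call and then picks one. A lazy variant maps each practice type to a builder function and calls only the one that was requested. The sketch below shows the pattern with two stand-in builders; `conversation_prompts`, `debate_prompts`, and `BUILDERS` are hypothetical names, not the project's helpers.

```python
from typing import Callable, Dict

def conversation_prompts(topic: str, level: str) -> Dict:
    return {"type": "conversation", "prompt": f"Talk about {topic} ({level})."}

def debate_prompts(topic: str, level: str) -> Dict:
    return {"type": "debate", "prompt": f"Argue for or against {topic} ({level})."}

# Map each practice type to a builder; nothing runs until one is selected.
BUILDERS: Dict[str, Callable[[str, str], Dict]] = {
    "conversation": conversation_prompts,
    "debate": debate_prompts,
}

def generate_practice_prompt(practice_type: str, topic: str, level: str) -> Dict:
    """Build only the requested prompt, defaulting to conversation practice."""
    builder = BUILDERS.get(practice_type, conversation_prompts)
    return builder(topic, level)
```

The default behaves like `prompts.get(practice_type, prompts['conversation'])` in the original, so an unknown practice type still yields a usable conversation prompt.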


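🧪 Usage Guide

  1. Go to the home page → Start Assessment
  2. Choose one of the assessments:
    • Quick (2 minutes)
    • Deep dive (5-7 minutes)
    • Topic-based
  3. Speak naturally. The AI listens, transcribes, and analyzes.
  4. See right away:
    • Fluency score (1-10)
    • CEFR level (A1–C2)
    • Feedback
    • Personalized tips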

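📊 Interpreting Your Results

  • Score

    • 1-3: Basic, needs improvement
    • 4-6: Intermediate
    • 7-8: Advanced
    • 9-10: Professional level
  • CEFR

    • A1–A2: Basic user
    • B1–B2: Independent user
    • C1–C2: Proficient user, expert-level mastery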
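
The score bands from the result interpretation list translate directly into a lookup. This is a minimal sketch: `describe_score` and `describe_cefr` are hypothetical helpers, the handling of fractional scores (3.5 counts as Intermediate) is my assumption, and the group names follow the standard CEFR terms.

```python
def describe_score(score: float) -> str:
    """Map a 1-10 overall score to its band label."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score <= 3:
        return "Basic, needs improvement"
    if score <= 6:
        return "Intermediate"
    if score <= 8:
        return "Advanced"
    return "Professional level"

CEFR_GROUPS = {"A": "Basic user", "B": "Independent user", "C": "Proficient user"}
CEFR_LEVELS = {"A1", "A2", "B1", "B2", "C1", "C2"}

def describe_cefr(level: str) -> str:
    """Map a CEFR level such as 'B2' to its user group."""
    level = level.strip().upper()
    if level not in CEFR_LEVELS:
        raise ValueError(f"unknown CEFR level: {level!r}")
    return CEFR_GROUPS[level[0]]
```

Both functions raise on out-of-range input instead of guessing, which helps when the values come from model output that may not follow the requested format.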

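Final Words

This isn't just a hackathon project.

This is me, learning to speak.
This is you, breaking the silence.
This is us, making the world a little clearer, one word at a time.

As an introverted programmer who doesn't talk much, I built this because I had to. Not to win, but to survive, and to grow.

If this small voice agent, built with code, hope, and a bit of self-healing, helps even one person speak better, it has already succeeded.

If you believe in the quiet ones too, vote for Speak Reflect Improve, because sometimes they are the ones who start the loudest revolutions.

🫶 With love,
Divya 💗💗

[GIF: thank you]

Source: https://dev.to/divyasinghdev/fluencemate-your-smart-fluency-friend-24x7-mentor-40f5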