feat: expose language detection probabilities to server example #3044

Merged: 3 commits, Apr 28, 2025

sachaarbonel
Contributor

Description:
This PR enhances the JSON API response by adding detailed language detection information when transcribing or translating audio. The changes include:

  1. The detection probability of the detected language
  2. A comprehensive list of language probabilities for all languages with non-negligible confidence scores (>0.001)
  3. Integration with Whisper's existing language detection capabilities

The new information is added under a language_detection field in the JSON response, containing:

  • probability: Confidence score for the detected language
  • language_probabilities: Map of language codes to their detection probabilities

This enhancement provides more transparency into the language detection process and can be valuable for applications requiring confidence scores in language identification.

The changes are non-breaking and only add additional information to the existing JSON response structure.
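The thresholding described above can be sketched in a few lines. This is an illustration in Python, not the PR's code (the actual change lives in the C++ server example). The function name `build_language_detection` and the input dictionary are hypothetical; only the field names and the 0.001 cutoff come from the description.

```python
# Illustrative sketch only: the PR implements this logic in the C++ server example.
# `build_language_detection` and its input format are hypothetical.

THRESHOLD = 0.001  # cutoff stated in the PR description


def build_language_detection(lang_probs, threshold=THRESHOLD):
    """Build the `language_detection` object from a {code: probability} map."""
    # The detected language is the one with the highest probability.
    detected = max(lang_probs, key=lang_probs.get)
    return {
        "probability": lang_probs[detected],
        # Keep only languages whose probability exceeds the threshold.
        "language_probabilities": {
            code: p for code, p in lang_probs.items() if p > threshold
        },
    }


# Made-up probabilities; "it" falls below the cutoff and is dropped.
sample = {"en": 0.982, "fr": 0.008, "es": 0.005, "de": 0.003, "it": 0.0004}
result = build_language_detection(sample)
```

Because the detected language has the highest probability, it should pass the cutoff in any realistic case, so `probability` would normally match that language's entry in `language_probabilities`.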

Example Output:

{
  "task": "transcribe",
  "language": "english",
  "text": "This is the transcribed text of the audio file.",
  "language_detection": {
    "probability": 0.982,
    "language_probabilities": {
      "en": 0.982,
      "fr": 0.008,
      "es": 0.005,
      "de": 0.003
    }
  },
  "segments": [
    // ... segments array content ...
  ]
}

In this example:

  • The main detected language (English) has a 98.2% confidence score
  • Other languages with lower probabilities are also included
  • Only languages with probabilities > 0.001 (0.1%) are shown
  • The original JSON structure remains intact, with the new language_detection field added
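On the client side, the new field might be consumed roughly as follows. This is a minimal sketch using only Python's standard library: the payload mirrors the example output above (with an empty `segments` array), and the helper names and the 0.9 confidence floor are assumptions chosen for illustration, not part of the PR.

```python
import json

# Payload mirroring the example output above; `segments` is left empty here.
response_text = """
{
  "task": "transcribe",
  "language": "english",
  "text": "This is the transcribed text of the audio file.",
  "language_detection": {
    "probability": 0.982,
    "language_probabilities": {"en": 0.982, "fr": 0.008, "es": 0.005, "de": 0.003}
  },
  "segments": []
}
"""


def confident_language(response, min_probability=0.9):
    """Return the detected language, or None if detection is missing or weak.

    `min_probability` is a hypothetical application-level floor.
    """
    detection = response.get("language_detection")
    if detection is None:
        return None  # response without the new field
    if detection["probability"] < min_probability:
        return None
    return response["language"]


def ranked_languages(response):
    """Return (code, probability) pairs sorted from most to least likely."""
    detection = response.get("language_detection", {})
    probs = detection.get("language_probabilities", {})
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


response = json.loads(response_text)
language = confident_language(response)
ranking = ranked_languages(response)
```

Both helpers tolerate a response that lacks `language_detection`, which fits the non-breaking nature of the change: clients written this way would still work against a server that does not emit the field.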

@sachaarbonel changed the title from "feat: expose language detection probabilities to server.cpp" to "feat: expose language detection probabilities to server example" on Apr 14, 2025
@sachaarbonel
Contributor Author

@danbev I addressed the code review comments. Could you please review again?

@danbev danbev merged commit f0171f0 into ggml-org:master Apr 28, 2025
51 of 52 checks passed
2 participants