RWTH ASR Speech and Voice Recognition App


by RWTH Aachen

The RWTH Aachen University Speech Recognition System
Helps with: Speech and Voice Recognition
Source Type: Closed
Languages: CPP

What is it all about?

RWTH ASR ("RASR" for short) is a software package that contains a speech recognition decoder together with tools for developing acoustic models for use in speech recognition systems. It has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2001. Speech recognition systems built with this framework have been applied successfully in several international research projects and the corresponding evaluations.

Key Features

• Decoder for large vocabulary continuous speech recognition
  ◦ word conditioned tree search (supporting across-word models)
  ◦ optimized HMM emission probability calculation using SIMD instructions
  ◦ refined acoustic pruning using language model lookahead
  ◦ word lattice generation
• Feature extraction
  ◦ a flexible framework for data processing: Flow
  ◦ MFCC features
  ◦ PLP features
  ◦ Gammatone features
  ◦ voicedness feature
  ◦ vocal tract length normalization (VTLN)
  ◦ support for several feature dimension reduction methods (e.g. LDA, PCA)
  ◦ easy implementation of new features as well as easy integration of external features using Flow networks
• Acoustic modeling
  ◦ Gaussian mixture distributions for HMM emission probabilities
  ◦ phoneme in triphone context (or shorter context)
  ◦ across-word context dependency of phonemes
  ◦ allophone parameter tying using phonetic decision trees (classification and regression trees, CART)
  ◦ globally pooled diagonal covariance matrix (other types of covariance modelling are possible, but not fully tested)
  ◦ maximum likelihood training
  ◦ discriminative training (minimum phone error (MPE) criterion)
  ◦ linear algebra support using LAPACK, BLAS
• Language modeling
  ◦ support for language models in ARPA format
  ◦ weighted grammars (weighted finite state automaton)
• Neural networks (new in v0.6)
  ◦ training of arbitrarily deep feed-forward networks
  ◦ CUDA support for running on GPUs
  ◦ OpenMP support for running on CPUs
  ◦ variety of activation functions, training criteria and optimization algorithms
  ◦ sequence discriminative training, e.g. MMI or MPE (new in v0.7)
  ◦ integration in feature extraction pipeline ("Tandem approach")
  ◦ integration in search and lattice processing pipeline ("Hybrid NN/HMM approach")
• Speaker adaptation
  ◦ Constrained MLLR (CMLLR, "feature space MLLR", fMLLR)
  ◦ Unsupervised maximum likelihood linear regression mean adaptation (MLLR)
  ◦ speaker / segment clustering using Bayesian Information Criterion (BIC) as stop criterion
• Lattice processing
  ◦ n-best list generation
  ◦ confusion network generation and decoding
  ◦ lattice rescoring
  ◦ lattice based system combination
• Input / output formats
  ◦ nearly all input and output data is in easily processable XML or plain text formats
  ◦ converter tools for the generation of NIST file formats are included
  ◦ HTK lattice format
  ◦ converter tools for HTK models
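
The ARPA language model format mentioned under language modeling is a plain-text format in which each n-gram line holds a base-10 log probability, the words, and an optional backoff weight. The following is a minimal Python sketch of reading such a file and scoring a bigram with backoff. It is not RASR code and does not use any RASR API: the names `parse_arpa`, `bigram_log10`, and `SAMPLE` are hypothetical, and the tiny model is made up for illustration.

```python
import re

# A tiny, made-up ARPA model (hypothetical values, for illustration only).
SAMPLE = r"""
\data\
ngram 1=3
ngram 2=2

\1-grams:
-1.0 <s> -0.5
-0.75 hello -0.25
-1.5 </s>

\2-grams:
-0.25 <s> hello
-0.5 hello </s>

\end\
"""


def parse_arpa(text):
    """Return (counts, ngrams) where
    counts = {order: declared number of n-grams} and
    ngrams = {order: {word_tuple: (log10_prob, backoff_or_None)}}."""
    counts, ngrams, order = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "\\data\\":
            continue
        if line == "\\end\\":
            break
        header = re.fullmatch(r"ngram (\d+)=(\d+)", line)
        if header:
            counts[int(header.group(1))] = int(header.group(2))
            continue
        section = re.fullmatch(r"\\(\d+)-grams:", line)
        if section:
            order = int(section.group(1))
            ngrams[order] = {}
            continue
        if order is None:
            continue  # ignore any free text before the first n-gram section
        fields = line.split()
        log_prob = float(fields[0])
        words = tuple(fields[1:1 + order])
        # The backoff weight is optional and follows the words.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else None
        ngrams[order][words] = (log_prob, backoff)
    return counts, ngrams


def bigram_log10(ngrams, w1, w2):
    """log10 P(w2 | w1): use the bigram if listed, else back off to the unigram."""
    bigrams = ngrams.get(2, {})
    if (w1, w2) in bigrams:
        return bigrams[(w1, w2)][0]
    backoff = ngrams[1].get((w1,), (0.0, None))[1] or 0.0
    return backoff + ngrams[1][(w2,)][0]


counts, ngrams = parse_arpa(SAMPLE)
print(counts)
print(bigram_log10(ngrams, "<s>", "hello"))    # listed bigram
print(bigram_log10(ngrams, "hello", "hello"))  # unseen bigram, uses backoff
```

The sketch assumes whitespace-separated fields and words that contain no spaces; a production reader would also validate the declared counts against the parsed entries.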




