Pronunciation Assessment (streaming version) API documentation

Interface description

This interface uses intelligent speech technology to automatically evaluate pronunciation level, detect pronunciation errors, locate defects, and analyze problems. The core technology has two parts: automatic evaluation of Mandarin Chinese pronunciation and automatic evaluation of English pronunciation.

  1. Get the authentication credentials: apply for an appid on the IFLYTEK open platform and add the streaming interface service to obtain the interface keys APIKey and APISecret.
  2. Integrate the Websocket interface: generic interface plus parameter description. The test question format differs between Chinese and English; see the test question format description.

Interface Demo

A sample demo is available for download.
Demos are currently provided for only some development languages; for other languages, refer to the interface documentation below.

Interface requirements

Contents | Description
Request protocol | ws
Request address | ws://ise-api-sg.xf-yun.com/v2/ise
Interface authentication | Signature mechanism; see Interface authentication below for details
Development languages | Any, as long as it can initiate Websocket requests to the cloud service
Audio properties | Sampling rate 16k, bit depth 16 bit, mono
Audio format | pcm, wav, mp3 (requires setting aue to lame), speex-wb
Audio length | An audio-sending session cannot exceed 5 minutes
Language type | Chinese, English

Interface call flow

  1. Parameter upload phase (see the description of business parameters (business)):
    For the first upload of parameters, set data.status=0 and cmd="ssb";
  2. Audio upload phase, during which audio data is uploaded:
    The first audio frame must set cmd="auw", aus=1, data.status=1;
    Intermediate audio frames must set cmd="auw", aus=2, data.status=1;
    The last audio frame must set cmd="auw", aus=4, data.status=2;
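
The flag values in the call flow above can be sketched as a small helper. This is only an illustration: the class IseFrameFlags and its method names are hypothetical, and treating a single-frame upload as a "last" frame (aus=4) is an assumption the text above does not confirm.

```java
// Sketch only: IseFrameFlags is a hypothetical helper, not part of any SDK.
final class IseFrameFlags {
    /** cmd value: "ssb" for the parameter frame, "auw" for every audio frame. */
    static String cmd(boolean parameterFrame) {
        return parameterFrame ? "ssb" : "auw";
    }

    /** aus value for audio frame `index` (0-based) out of `total` audio frames. */
    static int aus(int index, int total) {
        if (index == total - 1) {
            return 4; // last frame (assumption: also used when there is only one frame)
        }
        return index == 0 ? 1 : 2; // 1 = first frame, 2 = intermediate frame
    }

    /** data.status for an audio frame: 2 on the last frame, 1 otherwise. */
    static int status(int index, int total) {
        return index == total - 1 ? 2 : 1;
    }

    public static void main(String[] args) {
        int total = 4;
        // The parameter frame comes first, with data.status=0.
        System.out.println("cmd=" + cmd(true) + " status=0");
        for (int i = 0; i < total; i++) {
            System.out.println("cmd=" + cmd(false) + " aus=" + aus(i, total)
                    + " status=" + status(i, total));
        }
    }
}
```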

Interface authentication

In the handshake phase, the requester must sign the request, and the server uses the signature to verify that the request is legitimate.

Authentication Methods

Authentication parameters are appended to the request address. Example url:

ws://ise-api-sg.xf-yun.com/v2/ise?authorization=YXBpX2tleT0ia2V5eHh4eHh4eHg4ZWUyNzkzNDg1MTlleHh4eHh4eHgiLCBhbGdvcml0aG09ImhtYWMtc2hhMjU2IiwgaGVhZGVycz0iaG9zdCBkYXRlIHJlcXVlc3QtbGluZSIsIHNpZ25hdHVyZT0iV0MxdFR6MkRJK0E4bktQTmh6N3Q3bEloRzFWQktEaEQzSytSM0trQ0hPcz0i&host=ise-api-sg.xf-yun.com&date=Tue%2C+22+Dec+2020+06%3A29%3A31+GMT

Authentication Parameters:

Parameter | Type | Required | Description | Example
host | string | Yes | Request host | ise-api-sg.xf-yun.com
date | string | Yes | Current timestamp in RFC1123 format | Wed, 10 Jul 2019 07:35:43 GMT
authorization | string | Yes | Base64-encoded signature information (the signature is calculated with hmac-sha256) | See the rules for generating the authorization parameter below

Detailed rules for generating the authorization parameter

(1) Get the interface keys APIKey and APISecret.
After registration, you can view them on the console page.
(2) Before base64 encoding, the authorization parameter (authorization_origin) has the following format.

api_key="$api_key",algorithm="hmac-sha256",headers="host date request-line",signature="$signature"

Here api_key is the APIKey obtained from the console, algorithm is the signing algorithm (only hmac-sha256 is supported), and headers lists the parameters involved in signing (see the note below).
signature is the string obtained by signing the parameters involved in the signature with that algorithm and base64-encoding the result; see below.

Note: headers lists the parameters involved in signing; it holds the fixed parameter names ("host date request-line"), not the values of these parameters.

(3) The original signature string (signature_origin) is built as follows.

It consists of the host, date, and request-line parameters concatenated in the following format
(\n is a line break, and each ':' is followed by a space):

host: $host\ndate: $date\n$request-line

Suppose:

request url = ws://ise-api-sg.xf-yun.com/v2/ise
date = Wed, 10 Jul 2019 07:35:43 GMT

Then the original signature string (signature_origin) is:

host: ise-api-sg.xf-yun.com
date: Wed, 10 Jul 2019 07:35:43 GMT
GET /v2/ise HTTP/1.1

(4) Sign signature_origin with the hmac-sha256 algorithm, using apiSecret as the key, to get the signed digest signature_sha.
signature_sha=hmac-sha256(signature_origin,$apiSecret)

where apiSecret is the APISecret obtained from the console.
(5) Base64-encode signature_sha to get the final signature.
signature=base64(signature_sha)

Suppose:

APISecret = secretxxxxxxxxxx2df7900c09xxxxxxxxxxx
date = Wed, 10 Jul 2019 07:35:43 GMT

Then the signature is:
signature=WC1tTz2DI+A8nKPNhz7t7lIhG1VBKDhD3K+R3KkCHOs=

(6) Based on the above, build the string that precedes base64 encoding (authorization_origin). For example:
api_key="keyxxxxxxxxxx8ee279348519exxxxxxxxxx", algorithm="hmac-sha256", headers="host date request-line", signature="WC1tTz2DI+A8nKPNhz7t7lIhG1VBKDhD3K+R3KkCHOs="

(7) Finally, base64-encode authorization_origin to get the final authorization parameter.

authorization = base64(authorization_origin)
Example:
authorization=YXBpX2tleT0ia2V5eHh4eHh4eHg4ZWUyNzkzNDg1MTlleHh4eHh4eHgiLCBhbGdvcml0aG09ImhtYWMtc2hhMjU2IiwgaGVhZGVycz0iaG9zdCBkYXRlIHJlcXVlc3QtbGluZSIsIHNpZ25hdHVyZT0iV0MxdFR6MkRJK0E4bktQTmh6N3Q3bEloRzFWQktEaEQzSytSM0trQ0hPcz0i

Authentication url example (Java)

import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Base64;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import okhttp3.HttpUrl; // assumption: HttpUrl is OkHttp's class; the original example does not name the library

// hostUrl must use an http:// or https:// scheme, because java.net.URL has no handler for ws://.
public static String getAuthUrl(String hostUrl, String apiKey, String apiSecret) throws Exception {
  URL url = new URL(hostUrl);
  // date: current time in RFC1123 format, GMT
  SimpleDateFormat format = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
  format.setTimeZone(TimeZone.getTimeZone("GMT"));
  String date = format.format(new Date());
  // signature_origin: "host: $host\ndate: $date\n$request-line"
  String signatureOrigin = "host: " + url.getHost() + "\n"
    + "date: " + date + "\n"
    + "GET " + url.getPath() + " HTTP/1.1";
  // signature = base64(hmac-sha256(signature_origin, apiSecret))
  Mac mac = Mac.getInstance("HmacSHA256");
  mac.init(new SecretKeySpec(apiSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
  byte[] digest = mac.doFinal(signatureOrigin.getBytes(StandardCharsets.UTF_8));
  String signature = Base64.getEncoder().encodeToString(digest);
  // authorization = base64(authorization_origin)
  String authorizationOrigin = String.format("api_key=\"%s\", algorithm=\"%s\", headers=\"%s\", signature=\"%s\"", apiKey, "hmac-sha256", "host date request-line", signature);
  String authorization = Base64.getEncoder().encodeToString(authorizationOrigin.getBytes(StandardCharsets.UTF_8));
  // addQueryParameter URL-encodes the values
  HttpUrl httpUrl = HttpUrl.parse("https://" + url.getHost() + url.getPath()).newBuilder()
    .addQueryParameter("authorization", authorization)
    .addQueryParameter("date", date)
    .addQueryParameter("host", url.getHost())
    .build();
  return httpUrl.toString();
}

Authentication results

If the handshake succeeds, HTTP status code 101 is returned, indicating that the protocol upgrade succeeded. If the handshake fails, an HTTP status code that depends on the error type is returned together with an error description. The errors are as follows:

HTTP Code | Description | Error Message | Resolution
401 | Missing authorization parameter | {"message": "Unauthorized"} | Check that the authorization parameter is present; see the authorization parameter generation rules
401 | Signature parameter parsing failed | {"message": "HMAC signature cannot be verified"} | Check whether any signature parameter is missing; in particular, make sure the copied api_key is correct
401 | Signature verification failed | {"message": "HMAC signature does not match"} | There are several possible causes: 1. check that api_key and api_secret are correct; 2. check that the host, date, and request-line parameters used to calculate the signature are concatenated as the protocol requires; 3. check that the base64 length of signature is normal (normally 44 bytes)
403 | Clock offset verification failed | {"message": "HMAC signature cannot be verified,a valid date or x-date header is required for HMAC Authentication"} | Check that the clock of the requesting machine is accurate; a difference of more than 5 minutes triggers this error
403 | IP whitelist validation failed | {"message": "Your IP address is not allowed"} | Disable IP whitelisting on the console, or check that the IP address set in the whitelist is the external IP address of the local machine
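
Two of the checks in the table above (the 44-byte signature length and the 5-minute clock offset) can be verified locally before connecting. The following is a minimal sketch using only the JDK; the class HandshakeChecks and its method names are hypothetical, and the server's exact tolerance rules may differ from this approximation.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Base64;

// Sketch only: HandshakeChecks is a hypothetical helper for local troubleshooting.
final class HandshakeChecks {
    /** True if `signature` is 44 base64 characters that decode to a 32-byte HMAC-SHA256 digest. */
    static boolean signatureLengthIsNormal(String signature) {
        if (signature == null || signature.length() != 44) {
            return false;
        }
        try {
            return Base64.getDecoder().decode(signature).length == 32;
        } catch (IllegalArgumentException e) {
            return false; // not valid base64
        }
    }

    /** True if the RFC1123 `date` differs from `now` by at most five minutes. */
    static boolean withinClockOffset(String date, Instant now) {
        try {
            Instant sent = ZonedDateTime.parse(date, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            return Duration.between(sent, now).abs().compareTo(Duration.ofMinutes(5)) <= 0;
        } catch (DateTimeParseException e) {
            return false; // date is not in RFC1123 format
        }
    }

    public static void main(String[] args) {
        System.out.println(signatureLengthIsNormal("WC1tTz2DI+A8nKPNhz7t7lIhG1VBKDhD3K+R3KkCHOs="));
        System.out.println(withinClockOffset("Wed, 10 Jul 2019 07:35:43 GMT", Instant.now()));
    }
}
```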

Handshake failure return example:

    HTTP/1.1 401 Forbidden
    Date: Thu, 06 Dec 2018 07:55:16 GMT
    Content-Length: 116
    Content-Type: text/plain; charset=utf-8
    {
        "message": "HMAC signature does not match"
    }

Interface data transmission and reception

After a successful handshake, the client and server establish a Websocket connection, over which the client can upload and receive data at the same time.

// Connection successful, start sending data
int frameSize = 1280; // bytes per audio frame; sending 1280B every 40ms is recommended. The size can be adjusted but must not exceed 19200B (no more than 26000B after base64 encoding), otherwise error 10163 (data length error) is reported.
int interval = 40; // milliseconds between frames
int status = 0; // status of the audio
try (FileInputStream fs = new FileInputStream(file)) {
  byte[] buffer = new byte[frameSize];
  // Send audio: read up to frameSize bytes at a time and send one frame every `interval` ms
}

Notes:
  1. The server supports websocket-version 13; make sure the framework used by the client supports this version.
  2. All frames returned by the server are of type TextMessage, which corresponds to opcode=1 in the native Websocket protocol frame. Make sure the client parses frames of this type; if it does not, try upgrading the client framework or switching to another framework.
  3. If a frame-splitting problem occurs, that is, one json packet reaches the client in multiple frames and the client fails to parse the json, the cause is in most cases the client framework's parsing of the Websocket protocol. If this happens, try upgrading the framework version or switching to another framework.
  4. When the client ends the session and needs to close the connection, try to ensure that the Websocket close code sent to the server is 1000. (If the client framework provides no interface for passing a code on close, this item can be ignored.)
  5. The number of bytes per frame differs between audio formats. Recommended for uncompressed PCM: send 1280 bytes every 40ms. The size can be adjusted but should not exceed 19200B, i.e. no more than 26000B after base64 encoding; otherwise error 10163 (data too long) is reported.
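
The recommended sizes follow from the audio properties: 16 kHz, 16-bit, mono audio produces 16000 × 2 = 32000 bytes per second, so 40ms of audio is 1280 bytes, and base64 turns every 3 raw bytes into 4 characters. A small sketch of that arithmetic (the class FrameSizing and its method names are hypothetical):

```java
import java.util.Base64;

// Sketch only: FrameSizing is a hypothetical helper illustrating the arithmetic.
final class FrameSizing {
    /** Raw PCM bytes produced in `frameMillis` milliseconds. */
    static int bytesPerFrame(int sampleRateHz, int bitsPerSample, int channels, int frameMillis) {
        return sampleRateHz * (bitsPerSample / 8) * channels * frameMillis / 1000;
    }

    /** Length of the padded base64 encoding of `rawBytes` bytes. */
    static int base64Length(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        int frame = bytesPerFrame(16000, 16, 1, 40);
        System.out.println("bytes per 40ms frame: " + frame);
        System.out.println("base64 length of one frame: " + base64Length(frame));
        System.out.println("base64 length of the 19200B maximum: " + base64Length(19200));
        // Cross-check the formula against the JDK encoder.
        System.out.println(Base64.getEncoder().encodeToString(new byte[19200]).length());
    }
}
```

The 19200B maximum encodes to 25600 characters, which stays under the 26000B limit stated above.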

Request Parameters

All request data are JSON strings.

Parameter | Type | Required | Description
common | object | Yes | Common parameters; uploaded only in the first frame after a successful handshake; see below
business | object | Yes | Business parameters; uploaded in the first frame after a successful handshake and with subsequent data; see below for details
data | object | Yes | Business data stream parameters; required in every request after a successful handshake; see below

Common parameter description (common)

Parameter | Type | Required | Description
app_id | string | Yes | APPID obtained from the platform

Description of business parameters (business)

Parameter | Type | Required | Description | Example
sub | string | Yes | Service type; must be ise (open evaluation) | "ise"
ent | string | Yes | Chinese: cn_vip; English: en_vip | "cn_vip"
category | string | Yes | Question type. Chinese: read_syllable (single-character reading, Chinese only), read_word (word reading), read_sentence (sentence reading), read_chapter (passage reading). English: read_word (word reading), read_sentence (sentence reading), read_chapter (passage reading), simple_expression (English situational reading), read_choice (English multiple choice), topic (English free response), retell (English retelling), picture_talk (English picture description), oral_translation (English oral translation) | "read_sentence"
aus | int | Yes | State of the audio during upload (mandatory in the audio upload stage, when cmd=auw): 1 = first frame of the audio; 2 = intermediate frame; 4 = last frame | set according to the upload stage
cmd | string | Yes | Distinguishes the data upload stages: ssb = parameter upload stage; ttp = text upload stage (skipped when ttp_skip=true, in which case the text in the text field is used directly); auw = audio upload stage | set according to the upload stage
text | string | Yes | Text to be evaluated, UTF-8 encoded, with a UTF-8 BOM prepended: '\uFEFF'+text |
tte | string | Yes | Encoding of the text to be evaluated: utf-8 or gbk | "utf-8"
ttp_skip | bool | Yes | Skip the ttp stage and evaluate the text supplied in the ssb stage directly (see the cmd parameter); default true | true
extra_ability | string | No | Extra abilities (effective only when ise_unite="1" and rst="entirety"). multi_dimension: return multi-dimensional score information (accuracy, fluency, and completeness scores); applies to word and sentence question types. pitch: return fundamental-frequency information (start and end values); word and sentence question types only. syll_phone_err_msg: return phoneme error information (whether the initial/final and the tone are correct); applies to word and sentence question types. Separate multiple abilities with semicolons, for example add("extra_ability","syll_phone_err_msg;pitch;multi_dimension") | "multi_dimension"
aue | string | No | Audio format. raw: uncompressed audio in pcm or wav format (for wav audio, removing the header is recommended); lame: audio in mp3 format; speex-wb;7: iFlytek custom audio in speex format (default) | "raw"
auf | string | No | Audio sample rate; default audio/L16;rate=16000 | "audio/L16;rate=16000"
rstcd | string | No | Encoding of the returned result: utf8 or gbk (default) | "utf8"
group | string | No | Age group; the same audio and paper are scored differently for different groups (supported only for Chinese syllable, word, sentence, and chapter questions). This parameter affects accuracy_score. adult: adult group (the default when group is not set); youth: secondary school group; pupil: elementary school group (Chinese sentence and chapter questions return accuracy_score when this value is set) | "adult"
check_type | string | No | Scoring and error-detection threshold (Chinese engine only): easy, common, or hard | "common"
grade | string | No | Grade level for the assessment (Chinese only; sentence and chapter question types for elementary and middle school): junior (grades 1-2); middle (grades 3-4); senior (grades 5-6) | "middle"
rst | string | No | Controls the format and level of detail of the returned result (also affected by the ise_unite and plev parameters). entirety (default): complete result; for percentile scoring in Chinese or English, passing rst="entirety" and ise_unite="1" together with the extra_ability parameter is recommended. plain: simplified result containing only the total score, for example <?xml version="1.0" ?><FinalResult><ret value="0"/><total_score value="98.507320"/></FinalResult> | "entirety"
ise_unite | string | No | Return result control. 0: no control (default); 1: control (the extra_ability parameter then affects whether information such as full-dimension scores is returned) | "0"
plev | string | No | With rst="entirety" (default) and ise_unite="0" (default), the value of plev affects the returned result. 0: return all information (Chinese includes rec_node_type, perr_msg, fluency_score, phone_score; English includes accuracy_score, serr_msg, syll_accent, fluency_score, standard_score, and pitch information) | "0"

Example of request parameters:
First data transmission:

{
  "common": {
    "app_id": "xxxxxxx"
  },
  "business": {
    "aue": "raw",
    "auf": "audio/L16;rate=16000",
    "category": "read_sentence",
    "cmd": "ssb",
    "ent": "en_vip",
    "sub": "ise",
    "text": "[content]When you don't know what you're doing, it's helpful to begin by learning about what you should not do. ",
    "ttp_skip": true
  },
  "data": {
    "status": 0
  }
}

Request data audio parameters (data)

Parameter | Type | Required | Description | Example
data | string | Yes | Audio data, base64 encoded | base64-encoded audio
status | int | Yes | Status of the data being sent: 0 = first transmission; 1 = intermediate data; 2 = last transmission | set according to the state of the data sent

Follow-up data sending

{
  "business": {
    "cmd": "auw",
    "aus":1
  },
  "data": {
    "status": 1,
    "data":"PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4K"
  }
}
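
An audio frame like the one above can be assembled from a raw audio chunk as follows. This is a sketch under stated assumptions: the class AudioFrameJson is hypothetical, and the JSON is built by plain string concatenation only to keep the example dependency-free (base64 output contains no characters that need JSON escaping); a real client would normally use a JSON library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: AudioFrameJson is a hypothetical helper, not part of any SDK.
final class AudioFrameJson {
    /** Builds the JSON text of one audio frame (cmd="auw") carrying `chunk` as base64. */
    static String audioFrame(byte[] chunk, int aus, int status) {
        String audio = Base64.getEncoder().encodeToString(chunk);
        return "{\"business\":{\"cmd\":\"auw\",\"aus\":" + aus + "},"
                + "\"data\":{\"status\":" + status + ",\"data\":\"" + audio + "\"}}";
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for 1280 bytes of real PCM audio.
        byte[] chunk = "hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println(audioFrame(chunk, 1, 1)); // first audio frame
        System.out.println(audioFrame(chunk, 4, 2)); // last audio frame
    }
}
```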

Return parameter description

Parameter | Type | Description
sid | string | ID of the current session; the same sid is returned throughout one session
code | int | Return code. 0 means the request succeeded; any other code means the request failed, and the client should disconnect immediately to end the session. For the list of error codes, see Error Code
message | string | Description of the error when one occurs
data | object | Returned data
data.data | string | Evaluation result: a base64 string that decodes to xml
data.status | int | Status of the result. status=2 means all results have been returned; the client should take the result with status=2 as the final result

Return example:

{
 "code": 0,
 "message": "success",
 "sid": "isexxxxxxxxxxxxxxxxxxxxxxxxx",
 "data": {
  "status": 2,
  "data": "<?xml version="1.0" encoding="UTF-8"?>
  <xml_result>
      <read_sentence lan="cn" type="study" version="7,0,0,1024">
          <rec_paper>
              <read_sentence accuracy_score="100.000000" beg_pos="0" content="今天天气怎么样。" emotion_score="87.315361" end_pos="150" except_info="0" fluency_score="87.620300" integrity_score="100.000000" is_rejected="false" phone_score="100.000000" time_len="150" tone_score="100.000000" total_score="92.511200">
                  <sentence beg_pos="0" content="今天天气怎么样" end_pos="150" fluency_score="0.000000" phone_score="100.000000" time_len="150" tone_score="100.000000" total_score="86.959984">
                      <word beg_pos="0" content="今" end_pos="22" symbol="jin1" time_len="22">
                          <syll beg_pos="0" content="fil" dp_message="32" end_pos="1" rec_node_type="fil" time_len="1">
                              <phone beg_pos="0" content="fil" dp_message="32" end_pos="1" rec_node_type="fil" time_len="1"></phone>
                          </syll>
                          <syll beg_pos="1" content="今" dp_message="0" end_pos="22" rec_node_type="paper" symbol="jin1" time_len="21">
                              <phone beg_pos="1" content="j" dp_message="0" end_pos="4" is_yun="0" perr_level_msg="2" perr_msg="0" rec_node_type="paper" time_len="3"></phone>
                              <phone beg_pos="4" content="in" dp_message="0" end_pos="22" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="18"></phone>
                          </syll>
                      </word>
                      <word beg_pos="22" content="天" end_pos="40" symbol="tian1" time_len="18">
                          <syll beg_pos="22" content="天" dp_message="0" end_pos="40" rec_node_type="paper" symbol="tian1" time_len="18">
                              <phone beg_pos="22" content="t" dp_message="0" end_pos="30" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                              <phone beg_pos="30" content="ian" dp_message="0" end_pos="40" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="10"></phone>
                          </syll>
                      </word>
                      <word beg_pos="40" content="天" end_pos="58" symbol="tian1" time_len="18">
                          <syll beg_pos="40" content="天" dp_message="0" end_pos="58" rec_node_type="paper" symbol="tian1" time_len="18">
                              <phone beg_pos="40" content="t" dp_message="0" end_pos="46" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="6"></phone>
                              <phone beg_pos="46" content="ian" dp_message="0" end_pos="58" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="12"></phone>
                          </syll>
                      </word>
                      <word beg_pos="58" content="气" end_pos="74" symbol="qi9" time_len="16">
                          <syll beg_pos="58" content="气" dp_message="0" end_pos="74" rec_node_type="paper" symbol="qi0" time_len="16">
                              <phone beg_pos="58" content="q" dp_message="0" end_pos="66" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                              <phone beg_pos="66" content="i" dp_message="0" end_pos="74" is_yun="1" mono_tone="TONE0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                          </syll>
                      </word>
                      <word beg_pos="74" content="怎" end_pos="84" symbol="zen3" time_len="10">
                          <syll beg_pos="74" content="怎" dp_message="0" end_pos="84" rec_node_type="paper" symbol="zen3" time_len="10">
                              <phone beg_pos="74" content="z" dp_message="0" end_pos="79" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                              <phone beg_pos="79" content="en" dp_message="0" end_pos="84" is_yun="1" mono_tone="TONE3" perr_level_msg="2" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                          </syll>
                      </word>
                      <word beg_pos="84" content="么" end_pos="93" symbol="me5" time_len="9">
                          <syll beg_pos="84" content="么" dp_message="0" end_pos="93" rec_node_type="paper" symbol="me0" time_len="9">
                              <phone beg_pos="84" content="m" dp_message="0" end_pos="88" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="4"></phone>
                              <phone beg_pos="88" content="e" dp_message="0" end_pos="93" is_yun="1" mono_tone="TONE0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                          </syll>
                      </word>
                      <word beg_pos="93" content="样" end_pos="150" symbol="yang4" time_len="57">
                          <syll beg_pos="93" content="样" dp_message="0" end_pos="112" rec_node_type="paper" symbol="yang4" time_len="19">
                              <phone beg_pos="93" content="_i" dp_message="0" end_pos="96" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="3"></phone>
                              <phone beg_pos="96" content="iang" dp_message="0" end_pos="112" is_yun="1" mono_tone="TONE4" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="16"></phone>
                          </syll>
                          <syll beg_pos="112" content="sil" dp_message="0" end_pos="150" rec_node_type="sil" time_len="38">
                              <phone beg_pos="112" content="sil" end_pos="150" time_len="38"></phone>
                          </syll>
                      </word>
                  </sentence>
              </read_sentence>
          </rec_paper>
      </read_sentence>
  </xml_result>"
 }
}
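
Since data.data arrives as a base64 string, a client has to decode it before reading any score. A minimal sketch with the JDK's built-in XML parser follows; the class IseResultDecoder and its method names are hypothetical. Passing the decoded bytes to the parser, rather than converting them to a String first, lets the parser honour the encoding declared in the XML itself, which may matter when rstcd is gbk.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch only: IseResultDecoder is a hypothetical helper, not part of any SDK.
final class IseResultDecoder {
    /** Decodes the base64 value of data.data and parses it as XML. */
    static Document decode(String base64Xml) {
        try {
            byte[] xml = Base64.getDecoder().decode(base64Xml);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalStateException("could not decode evaluation result", e);
        }
    }

    /** Returns `attribute` from the first `tag` element (document order) that has it, or null. */
    static String firstAttribute(Document doc, String tag, String attribute) {
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            if (element.hasAttribute(attribute)) {
                return element.getAttribute(attribute);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulate a rst="plain" result by encoding the sample XML locally.
        String plain = "<?xml version=\"1.0\" ?><FinalResult><ret value=\"0\"/>"
                + "<total_score value=\"98.507320\"/></FinalResult>";
        String encoded = Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        System.out.println(firstAttribute(decode(encoded), "total_score", "value"));
    }
}
```

For a complete (entirety) sentence result such as the one above, firstAttribute(doc, "read_sentence", "total_score") skips the outer read_sentence node, which has no total_score attribute, and reads the inner one.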

Chinese evaluation return parameter description

Question type | Node | Field information
Syllable and word questions (elementary, adult) | read_syllable or read_word | phone_score: initial/final (phone) score; tone_score: tone score; total_score: total score [(phone_score + tone_score)/2]
Syllable and word questions (elementary, adult) | sentence | No important information
Syllable and word questions (elementary, adult) | word | No important information
Syllable and word questions (elementary, adult) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 read-back; 128 replaced
Syllable and word questions (elementary, adult) | phone | dp_message: 0 normal; 16 missed; 32 added; 64 read-back; 128 replaced (when dp_message is not 0, perr_msg may take the same value as dp_message). mono_tone: tone type. perr_level_msg: confidence level of the error-detection result (three values 1, 2, 3; 1 is the best, 3 is the worst; a value of 0 can be disregarded). is_yun: 0 initial, 1 final. When is_yun=0, perr_msg has two states: 0 initial correct; 1 initial incorrect. When is_yun=1, perr_msg has four states: 0 final and tone both correct; 1 final incorrect; 2 tone incorrect; 3 final and tone both incorrect
Sentence and chapter questions (elementary) | read_sentence or read_chapter | accuracy_score: accuracy score; emotion_score: overall impression score (whether the reading is clear, fluent, expressive, etc.); fluency_score: fluency score; integrity_score: completeness score; phone_score: initial/final (phone) score; tone_score: tone score; total_score: total score [accuracy_score*0.4 + fluency_score*0.4 + overall impression score (emotion_score)*0.2]
Sentence and chapter questions (elementary) | sentence | phone_score: initial/final (phone) score; tone_score: tone score; total_score: total score [model regression]
Sentence and chapter questions (elementary) | word | No important information
Sentence and chapter questions (elementary) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 read-back; 128 replaced
Sentence and chapter questions (elementary) | phone | Same fields as the phone node of the syllable and word questions: dp_message, mono_tone, perr_level_msg, is_yun, perr_msg
Sentence and chapter questions (adult) | read_sentence or read_chapter | fluency_score: fluency score; integrity_score: completeness score; phone_score: initial/final (phone) score; tone_score: tone score; total_score: total score [model regression]
Sentence and chapter questions (adult) | sentence | phone_score: initial/final (phone) score; tone_score: tone score; total_score: total score [model regression]
Sentence and chapter questions (adult) | word | No important information
Sentence and chapter questions (adult) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 read-back; 128 replaced
Sentence and chapter questions (adult) | phone | Same fields as the phone node of the syllable and word questions: dp_message, mono_tone, perr_level_msg, is_yun, perr_msg
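
The two closed-form total_score rules in the Chinese table can be written out directly. In this sketch the class ChineseTotalScore and its method names are hypothetical; the scores marked [model regression] come from the engine and cannot be recomputed this way.

```java
// Sketch only: ChineseTotalScore is a hypothetical helper illustrating the formulas.
final class ChineseTotalScore {
    /** Syllable and word questions: total_score = (phone_score + tone_score) / 2. */
    static double wordTotal(double phoneScore, double toneScore) {
        return (phoneScore + toneScore) / 2.0;
    }

    /** Elementary sentence and chapter questions: 0.4*accuracy + 0.4*fluency + 0.2*emotion. */
    static double elementarySentenceTotal(double accuracyScore, double fluencyScore, double emotionScore) {
        return 0.4 * accuracyScore + 0.4 * fluencyScore + 0.2 * emotionScore;
    }

    public static void main(String[] args) {
        // accuracy_score, fluency_score and emotion_score taken from the return example.
        System.out.println(elementarySentenceTotal(100.0, 87.620300, 87.315361));
        System.out.println(wordTotal(100.0, 90.0));
    }
}
```

With the values from the return example above, the weighted sum agrees with the reported total_score of 92.511200 to within rounding.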

English Review Back to Parameter Description

Question Type Node Field Information
word question type (adult) read_word [adult word] total_score: total score[model regression]
Word Problems (Adults) sentence no important information
word question type (adult) word dp_message: 0 normal; 16 missed; 32 incremental; 64 readback; 128 replacement;
total_score: score for each word
word type (adult) syll syll_score: score for each syllable
serr_msg: syllable error detection [1 or 2049, it means the reading is wrong; when serr_msg=2049, it means both the syllable and the stress are wrong]
syll_accent: rereading error detection [0, it means the syllable does not need to be reread; 1, it means the syllable needs to be reread, and the engine will not detect it; if it is 2048 or 2049, it means the reading is wrong, and the engine will parse serr_msg again. If it is 0, the syllable does not need to be reread, and the engine does not detect it; if it is 1, the syllable needs to be reread, and the serr_msg is parsed, if it is 2048 or 2049, it means the syllable is wrongly reread, and the engine is optimizing the effect, so we can not pay attention to this case].
word question type (adult) phone dp_message: 0 normal; 16 missed; 32 incremental; 64 readback; 128 replacement;
Sentence and Chapter Questions (Adult) read_sentence or read_chapter accuracy_score: accuracy_score
standard_score:standard_score
fluency_score. integrity_score: integrity score
[Adult Sentence]
total_score: total_score = (0.6*accuracy_score + fluency_score*0.3+ standard_score*0.1)* integrity_score/100
[Adult Chapter]
total_score:total_score = (0.5*accuracy_score + fluency_score*0.3 + standard_score*0.2)*integrity_score/100
Sentence and Chapter Questions (Adult) sentence accuracy_score: accuracyscore
standard_score:standard_score
fluency_score:fluency_score
[Adult Sentence]
total_score. score: total_score =(0.6*accuracy_score + fluency_score*0.3 + standard_score*0.1)
[AdultChapter]
total_score: total_score = (0.5*accuracy_score + fluency_score*0.3 + standard_score*0.2)
Sentence and Chapter Questions (Adult) word dp_message: 0 normal; 16 omission; 32 insertion; 64 repetition; 128 substitution;
total_score: score for each word
Error checking for pauses, liaison, stress, and sentence-final rising and falling intonation:
1. Perform a bitwise AND of the property value in the XML with the Property value in the table on the right. (This feature is still being optimized and can be ignored for now.)
2. If the result equals the Property value in the table, this type of detection was performed here; if it does not, no detection was performed here. (This feature is still being optimized and can be ignored for now.)
3. Check whether werr_msg appears at the word level in the XML; if it does not appear, the word was read correctly. (This feature is still being optimized and can be ignored for now.)
4. If it does appear, perform a bitwise AND of the werr_msg value in the XML with the corresponding Werr_msg value in the table; if the result still equals the value of that type, this type was read incorrectly. (This feature is still being optimized and can be ignored for now.)
Sentence and Chapter Questions (Adult) syll syll_score: score for each syllable
serr_msg: syllable error detection [1 or 2049 means the syllable was read incorrectly; this feature is still being optimized, so this case can be ignored for now]
Sentence and Chapter Questions (Adult) phone dp_message: 0 normal; 16 omission; 32 insertion; 64 repetition; 128 substitution;
Scenario Response rec_paper total_score: total score [model regression]
story retelling-topic rec_paper total_score: total score [model regression]
Retelling questions, oral translation, to point questions, looking at pictures rec_paper accuracy_score: accuracy score
standard_score: standard score
fluency_score: fluency score
integrity_score: integrity score
total_score: total_score [model regression]
oral essay rec_paper total_score: total score [model regression]
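
The adult sentence and chapter formulas, and the bitwise AND checks described for werr_msg, can be reproduced with a few lines of Python. This is a minimal sketch for checking results on the client side: the function names `adult_total_score` and `flag_is_set` are illustrative assumptions, not part of the API, and the flag values passed to `flag_is_set` must come from the Property/Werr_msg table, which is not reproduced here.

```python
def adult_total_score(accuracy, fluency, standard, integrity=None, chapter=False):
    """Combine component scores using the adult sentence/chapter weights.

    integrity=None reproduces the sentence-node formula, which has no
    integrity factor; otherwise the weighted sum is scaled by integrity/100.
    """
    # Weights from the formulas in the table: (accuracy, fluency, standard).
    w_acc, w_flu, w_std = (0.5, 0.3, 0.2) if chapter else (0.6, 0.3, 0.1)
    weighted = w_acc * accuracy + w_flu * fluency + w_std * standard
    if integrity is None:
        return weighted
    return weighted * integrity / 100


def flag_is_set(value, flag):
    """Bitwise AND test: the flag counts as set when value & flag == flag."""
    return (value & flag) == flag
```

For example, `adult_total_score(80, 90, 70, integrity=100)` applies the sentence weights and yields approximately 82, and halving the integrity score halves the total.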

Explanation of the format of the test questions

Explanation of the Chinese test question format

Chinese characters (read_syllable)

Plain text example:
(1) The paper has no header and no node names.
(2) The paper may contain: simplified Chinese characters, traditional Chinese characters (within the GBK range), the Arabic numerals 0-9 (not recommended), and separators.
(3) Use a separator between two characters. No characters other than Chinese characters and spaces may appear at the beginning or end of a line.
(4) The paper may contain the Arabic numerals 0-9, but a paper consisting entirely of Arabic numerals is not supported. Numeric values and digit strings of more than two digits (e.g., years, phone numbers, times) must be written in Chinese numerals.
(5) A single line must not contain more than 100 Chinese characters.

Fung, Ching, Gov.

Example of pinyin labeling:
(1) Characters are separated from each other by line breaks.
(2) ü is written as v only in lü and nü, i.e. lv and nv (e.g., 女 "female": nv3); in all other cases it is written as u, e.g., 局 "bureau" (ju2). üe is written as ue, e.g., 略 "slightly" (lue4).
(3) The pinyin must be the correct dictionary pinyin, and the tone digit must be in the range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(4) Arabic numerals must not appear in the Chinese character part.
(5) In a paper with pinyin, every character must be given labeled pinyin.

<customizer: interphonic>
好
hao3
呈
cheng2

Note: The total number of Chinese characters in the paper must be in the range (0,200], and the total number of characters in the range (0,5000]. The recommended ranges are (0,100] for Chinese characters and (0,200] for characters.
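
A pinyin-labeled read_syllable paper can be assembled programmatically. The sketch below is illustrative only: `build_syllable_paper` is a hypothetical helper, and the regular expression is a simplifying assumption that a pinyin label is lowercase letters followed by one tone digit. It does not check that the pinyin is the correct dictionary reading.

```python
import re

HEADER = "<customizer: interphonic>"
# Assumed shape of one label: lowercase letters followed by one tone digit 0-9.
SYLLABLE = re.compile(r"^[a-z]+[0-9]$")


def build_syllable_paper(entries):
    """Build paper text from (character, pinyin) pairs, e.g. [("好", "hao3")]."""
    lines = [HEADER]
    for char, pinyin in entries:
        # Arabic numerals must not appear in the Chinese character part.
        if any(c in "0123456789" for c in char):
            raise ValueError(f"Arabic numeral in character part: {char!r}")
        if not SYLLABLE.match(pinyin):
            raise ValueError(f"expected letters plus a tone digit: {pinyin!r}")
        # Each character goes on its own line, followed by its pinyin line.
        lines.append(char)
        lines.append(pinyin)
    return "\n".join(lines)
```

Calling `build_syllable_paper([("好", "hao3"), ("呈", "cheng2")])` produces the header line followed by alternating character and pinyin lines.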

Chinese words (read_word)

Plain text example:
(1) The paper may contain: simplified Chinese characters, traditional Chinese characters (within the GBK range), the Arabic numerals 0-9 (not recommended), and separators.
(2) Use a separator between two words. No characters other than Chinese characters and spaces may appear at the beginning or end of a line.
(3) The paper may contain the Arabic numerals 0-9, but a paper consisting entirely of Arabic numerals is not supported. Numeric values and digit strings of more than two digits (e.g., years, phone numbers, times) must be written in Chinese numerals.
(4) A single line must not contain more than 100 Chinese characters.

Rather, not difficult.

Example of pinyin labeling:
(1) Words are separated from each other by line breaks.
(2) The paper may contain: simplified Chinese characters, pinyin, and the pinyin separator (|).
(3) The pinyin must be the correct dictionary pinyin, and the tone digit must be in the range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(4) Use the "|" symbol to separate the pinyin of the characters within a single word.
(5) Arabic numerals must not appear in the Chinese character part.
(6) In a paper with pinyin, every word must be given labeled pinyin.

<customizer: interphonic>
宁可
ning4|ke3
非难
fei1|nan4

Note: The total number of Chinese characters in the paper must be in the range (0,200], and the total number of characters in the range (0,5000]. The recommended ranges are (0,100] for Chinese characters and (0,200] for characters.

Chinese Sentence (read_sentence)

Plain text example:
(1) The paper may contain: simplified Chinese characters, traditional Chinese characters (within the GBK range), the Arabic numerals 0-9 (not recommended), and separators.
(2) The paper may contain the Arabic numerals 0-9, but a paper consisting entirely of Arabic numerals is not supported. Numeric values and digit strings of more than two digits (e.g., years, phone numbers, times) must be written in Chinese numerals.
(3) A sentence must not contain more than 100 Chinese characters.

This is an example of a Chinese statement review.

Example of pinyin labeling:
(1) Sentences are separated from each other by line breaks.
(2) The paper may contain: simplified Chinese characters, pinyin, and the pinyin separator (|).
(3) Arabic numerals, English words, and letters must not appear in the paper.
(4) The pinyin must be the correct dictionary pinyin, and the tone digit must be in the range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(5) Use the "|" symbol to separate the pinyin syllables within a sentence.
(6) A single line must not contain more than 100 Chinese characters.
(7) In a paper with pinyin, every word must be given labeled pinyin.

<customizer: interphonic>
今天天气怎么样
jin1|tian1|tian1|qi4|zen3|me5|yang4

Note: The total number of Chinese characters in the paper must be in the range (0,1000], and the total number of characters in the range (0,10000]. The recommended ranges are [5,500] for Chinese characters and (0,1000] for characters.
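
Before uploading a pinyin-labeled sentence, it can help to check that the pinyin line lines up with the text. The sketch below rests on two assumptions that the rules above do not state explicitly: that exactly one "|"-separated syllable is expected per Chinese character, and that a Chinese character is any code point in the basic CJK Unified Ideographs block. `check_sentence_entry` is a hypothetical helper name.

```python
def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def check_sentence_entry(text, pinyin, max_chars_per_line=100):
    """Return a list of problems found in one sentence line and its pinyin line."""
    hanzi = [c for c in text if is_cjk(c)]
    syllables = pinyin.split("|")
    problems = []
    if len(hanzi) > max_chars_per_line:
        problems.append("more than 100 Chinese characters in a single line")
    # Assumption: one pinyin syllable per Chinese character.
    if len(syllables) != len(hanzi):
        problems.append(
            f"{len(hanzi)} Chinese characters but {len(syllables)} pinyin syllables"
        )
    return problems
```

An empty list means no problem was found by these two checks; it does not guarantee that the engine accepts the paper.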

Chinese Chapter (read_chapter)

Plain text example (same as the sentence paper, except that a chapter consists of multiple sentences; see the sentence paper instructions for the rules):

This is an example of a Chinese statement review.

Example of pinyin labeling:

<customizer: interphonic>
今天天气怎么样
jin1|tian1|tian1|qi4|zen3|me5|yang4

Explanation of the English test question format

English Words (read_word)

Plain text:
(1) Required node: [word]. Note that entries are separated by line breaks.
(2) The number of words must not exceed 100.
(3) Words may be separated only by tabs, line breaks, or spaces.
(4) Symbols supported within words: the English half-width characters . - ' (dot, hyphen, and apostrophe). For example, p.m and year-old are supported; hello,world is not.
(5) Punctuation not supported within words: question marks, exclamation marks, semicolons, colons, and commas, as well as the illegal characters ( ) [ .
(6) Do not write punctuation as a separate word in the paper (i.e., punctuation with spaces on both sides); labeling will report an error.

[word]
apple
banana

Labeling the reading of numbers:
(1) [number_replace] must be placed on the line after the number.
(2) On the line after [number_replace], use the format "number/reading/". Each entry must contain exactly two "/" characters, and no symbols may be added to the content between them.

[word]
13
[number_replace]
13/thirteen/

Note: Do not put any characters unrelated to the words in the [word] node; they degrade the evaluation result.
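
The [word] and [number_replace] layout above can be generated from a word list. This is a minimal sketch under stated assumptions: `build_word_paper` is a hypothetical helper, and it only enforces the 100-word limit and the two-slash entry format, not the other character restrictions.

```python
def build_word_paper(words, number_readings=None):
    """Build an English read_word paper.

    words: list of words, one per line under [word].
    number_readings: optional dict such as {"13": "thirteen"}.
    """
    if len(words) > 100:
        raise ValueError("the number of words must not exceed 100")
    lines = ["[word]"] + list(words)
    if number_readings:
        lines.append("[number_replace]")
        for number, reading in number_readings.items():
            # Each entry must contain exactly two "/" characters,
            # so neither part may contain a slash itself.
            if "/" in number or "/" in reading:
                raise ValueError("number and reading must not contain '/'")
            lines.append(f"{number}/{reading}/")
    return "\n".join(lines)
```

For instance, `build_word_paper(["13"], {"13": "thirteen"})` returns the four lines of the numeric example shown above.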
English user-defined phonetic symbols:
Users can add their own phonetic symbols to this node, and the engine will evaluate the word according to them, regardless of how the word is actually pronounced. Customized phonetic symbols must be valid IFLYTEK phonetic symbols, not arbitrary ones, and customizing the phonetic symbols of numbers under this node is not recommended.
(1) An error is reported if an entry does not contain exactly two "/" characters;
(2) An error is reported if the phonetic transcription of a word is empty (//);
(3) An error is reported if a single phonetic transcription exceeds 128*6 bytes;
(4) Multiple pronunciations may be separated by the vertical bar "|";
(5) This node currently has no symbol error detection, so symbols can be added between the slashes, but symbols other than the vertical bar and the apostrophe are not recommended;

[word]
lose
[vocabulary]
lose/l uw z/
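
The [vocabulary] entry rules can be checked locally before a request is sent. The following is a sketch, not the engine's own validation: `check_vocabulary_entry` is a hypothetical name, and measuring the byte length in UTF-8 is an assumption, since the encoding used for the 128*6 byte limit is not stated.

```python
MAX_PHONETIC_BYTES = 128 * 6


def check_vocabulary_entry(entry):
    """Return a list of problems for a line such as 'lose/l uw z/'."""
    if entry.count("/") != 2:
        return ['entry must contain exactly two "/" characters']
    _word, phonetics, _rest = entry.split("/")
    problems = []
    if not phonetics.strip():
        problems.append("phonetic transcription is empty (//)")
    # Multiple pronunciations are separated by "|"; check each one separately.
    for variant in phonetics.split("|"):
        if len(variant.encode("utf-8")) > MAX_PHONETIC_BYTES:
            problems.append("a single phonetic transcription exceeds 128*6 bytes")
    return problems
```

An empty result for `check_vocabulary_entry("lose/l uw z/")` only means the entry is well-formed; it does not confirm that `l uw z` are valid IFLYTEK phonetic symbols.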

English Sentence (read_sentence)

Plain text:
(1) Required node: [content]. Note that entries are separated by line breaks.
(2) The content may use the four English half-width characters . ! ? ; to delimit clauses.
(3) The three characters ( ) [ must not appear at the beginning or in the middle of the text.
(4) The character [ must not appear at the end of the text; at most one ( or ) may appear there.
(5) Full-width characters are supported (a full-width character occupies two bytes, and the engine first converts full-width to half-width), but they must not account for more than 10% of the bytes of the entire content node.
(6) Unsupported characters must not account for more than 10% of the bytes of the entire content node. Common unsupported characters include @ , # , $ , % , & , + , { , }.
(7) Each sentence must not exceed 100 words or 1024 bytes (clause delimiters are also counted as one byte each).
(8) The total number of words must not exceed 1000.

[content]
This is an example of sentence test.

With supported English half-width characters:

[content]
I don't know.
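
The per-sentence and total limits for the [content] node can be approximated on the client side. This sketch makes several assumptions: sentences are split on the four delimiters . ! ? ; (so an abbreviation containing a dot is treated as a sentence boundary), words are counted by splitting on whitespace, and byte length is measured in UTF-8. The function name `check_content` is hypothetical.

```python
import re


def check_content(text):
    """Return a list of limit violations for the text of a [content] node."""
    problems = []
    # A sentence is a run of non-delimiter characters plus an optional delimiter.
    sentences = [s.strip() for s in re.findall(r"[^.!?;]+[.!?;]?", text) if s.strip()]
    total_words = 0
    for sentence in sentences:
        words = sentence.split()
        total_words += len(words)
        if len(words) > 100:
            problems.append("a sentence has more than 100 words")
        if len(sentence.encode("utf-8")) > 1024:
            problems.append("a sentence exceeds 1024 bytes")
    if total_words > 1000:
        problems.append("the paper has more than 1000 words")
    return problems
```

The engine's own tokenization may differ, so a clean result here should be read as a sanity check rather than a guarantee.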

Labeling the reading of numbers:
(1) An error is reported if an entry does not contain exactly two "/" characters.
(2) Multiple readings of a number are separated by the vertical bar "|".
(3) The content must be in lowercase letters.
(4) The length of a number replacement must not exceed 31.

[content]
I'm 13 years old.
[number_replace]
13/thirteen/

Note: Unless there is a special need, do not add anything to the [content] text that is unrelated to the paper, and do not alter words (e.g., writing long as l-o-n-g); doing so affects scoring.
Description of the non-essential nodes of sentence questions:

(1) [number_replace]: an error is reported if an entry does not contain exactly two "/" characters.
(2) [number_replace]: an error is reported if the replacement content is empty (//).
(3) [number_replace]: multiple readings of a number are separated by the vertical bar "|".
(4) [number_replace]: the content must be in lowercase letters.
(5) [number_replace]: the length of a number replacement must not exceed 31.
(6) [vocabulary]: an error is reported if an entry does not contain exactly two "/" characters.
(7) [vocabulary]: an error is reported if the phonetic transcription of a word is empty (//).
(8) [vocabulary]: an error is reported if a single phonetic transcription exceeds 128*6 bytes.
(9) [vocabulary]: multiple pronunciations may be separated by vertical bars.

English user-defined phonetic symbols:

(1) An error is reported if an entry does not contain exactly two "/" characters;
(2) An error is reported if the phonetic transcription of a word is empty (//);
(3) An error is reported if a single phonetic transcription exceeds 128*6 bytes;
(4) Multiple pronunciations may be separated by vertical bars;
(5) It is recommended not to use symbols in the content between the slashes;

[content]
I lose my pencil today.
[vocabulary]
lose/l uw z/

Labeling requires IFLYTEK phonetic symbols; for the cross-reference with standard phonetic symbols, see below:

IFLYTEK phonetic    Standard phonetic   IFLYTEK phonetic    Standard phonetic
aa                  ɑː                  f                   f
ae                  æ                   g                   g
ah                  ʌ                   hh                  h
ao                  ɔː                  jh
ar                                      k                   k
aw                                      l                   l
ax                  ə                   m                   m
ay                                      n                   n
eh                  e                   ng                  ŋ
er                  ɜː                  p                   p
ey                                      r                   r
ih                  ɪ                   s                   s
ir                  ɪə                  sh                  ʃ
iy                                      t                   t
oo                  ɒ                   th                  θ
ow                  əʊ                  v                   v
oy                  ɒɪ                  w                   w
uh                  ʊ                   y                   j
uw                                      z                   z
ur                  ʊə                  zh                  ʒ
b                   b                   dr                  dr
ch                                      dz                  dz
d                   d                   tr                  tr
dh                  ð                   ts                  ts

English Chapter (read_chapter)

Example of a test paper:
(1) Required node: [content]. Note that entries are separated by line breaks.
(2) The content may use the four English half-width characters . ! ? ; to delimit clauses.
(3) The three characters ( ) [ must not appear at the beginning or in the middle of the text.
(4) The character [ must not appear at the end of the text; at most one ( or ) may appear there.
(5) Full-width characters are supported (a full-width character occupies two bytes, and the engine first converts full-width to half-width), but they must not account for more than 10% of the bytes of the entire content node.
(6) Unsupported characters must not account for more than 10% of the bytes of the entire content node. Common unsupported characters include @ , # , $ , % , & , + , { , }.
(7) Each sentence must not exceed 100 words or 1024 bytes (clause delimiters are also counted as one byte each).
(8) The total number of words must not exceed 1000.
(9) Do not add meaningless character combinations to the text, such as arbitrary mixes of digits, letters, and symbols (e.g., 7FH34J).

[content]
Hello,everybody.This is an example of chapter test.

Note: Unless there is a special need, do not add anything to the [content] text that is unrelated to the paper, and do not alter words (e.g., writing long as l-o-n-g); doing so affects scoring.

English Situational Response (simple_expression)

Example of a test paper:
(1) Required nodes: [choice], [keywords]. Note that entries are separated by line breaks.
(2) Clauses are delimited by the five English half-width characters , . ! ? ;
(3) The serial numbers of the options must be consecutive, and each option must be written as "serial number + dot + space + content".
(4) Each option must occupy a single line. If an option is manually broken onto a new line (automatic wrapping by the system excepted), leaving the second line without a serial number, an error is reported.
(5) The three characters ( ) [ must not appear at the beginning or in the middle of any choice option text; otherwise an error is reported.
(6) At most one ( or ) may appear at the end of each choice option text.
(7) Full-width characters added to a choice option must not account for more than 10% of the content bytes of each choice node.
(8) Unsupported characters in a choice option must not account for more than 10% of the content bytes of each choice node. Common unsupported characters are: @ , # , $ , % , ^ , & , * , + , = , { , }.
(9) Each choice option must not exceed 100 words, not counting symbols.
(10) Unless there is a special need, do not add any characters unrelated to the content of a choice option, including serial numbers, digits, and arbitrary characters; doing so affects labeling and scoring.

[choice]

1. What should I do with the topic?
2. How can I deal with the topic?
3. What can I do with the topic?
4. What should I do with this subject?
5. How can I deal with this subject?
6. What can I do with this subject?
7. What should I do with this title?
8. How can I deal with this title?
9. What can I do with this title?
10. What should I manage this title?
11. How can I manage this title?
12. What can I manage this title?
13. What should I manage this subject?
14. How can I manage this subject?
15. What should I manage this topic?
16. How can I manage this topic?
17. What can I manage this topic?
18. How should I deal with this topic?
19. How should I deal with this title?
20. How should I deal with this subject?
[keywords]
what do topic | how deal topic | what do subject | how deal subject | what do title | how deal title | what manage title | how manage title | what manage subject | how manage subject | what manage topic | how manage topic
[script]
W: Congratulations, Tom! You gave a wonderful speech yesterday morning.
M: Thank you Mary.
W: I will give a speech next Wednesday in my English class, but I am not fully prepared yet. Can you give me some advice?
M: Sure. What's your topic?
W: Well, I am always concerned about environmental issues, so my topic is Environmental Protection.
M: This is a good topic, but it is too big.
[question]
How do I approach this topic?

[macanswer]
You have to narrow down your topic. For example, you may talk about what college students can do to protect our environment. After that, you need to do some research to collect relevant information as much as possible. Then, you should organize your arguments well. Logical organization is very important.
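
Papers of this kind are built from `[name]` node headers followed by content lines, so a small parser is enough to check the option numbering before upload. The sketch below is illustrative: `parse_nodes` and `check_choice_numbering` are hypothetical helpers, and treating any line of the form `[name]` as a node header is an assumption drawn from the examples, not a documented grammar.

```python
import re


def parse_nodes(paper):
    """Split paper text into {node_name: [non-empty lines]} using [name] header lines."""
    nodes, current = {}, None
    for line in paper.splitlines():
        stripped = line.strip()
        header = re.fullmatch(r"\[(\w+)\]", stripped)
        if header:
            current = header.group(1).lower()
            nodes[current] = []
        elif stripped and current is not None:
            nodes[current].append(stripped)
    return nodes


def check_choice_numbering(choice_lines):
    """True when every line is 'N. content' with N = 1, 2, 3, ... in order."""
    for expected, line in enumerate(choice_lines, start=1):
        # Serial number + dot + exactly one space + content.
        match = re.match(r"(\d+)\. \S", line)
        if not match or int(match.group(1)) != expected:
            return False
    return True
```

A typical use is `check_choice_numbering(parse_nodes(paper)["choice"])`, which fails if a serial number is skipped or an option has wrapped onto a second, unnumbered line.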

English multiple choice (read_choice)

Example of a test paper:
(1) Required nodes: [choice], [keywords]. Note that entries are separated by line breaks.
(2) Clauses are delimited by the five English half-width characters , . ! ? ;
(3) The serial numbers of the options must be consecutive, and each option must be written as "serial number + dot + space + content".
(4) Each option must occupy a single line. If an option is broken onto a new line, leaving the second line without a serial number, an error is reported.
(5) Full-width characters in a choice option must not account for more than 10% of the content bytes of the entire choice node.
(6) Unsupported characters in a choice option must not account for more than 10% of the content bytes of the entire choice node.
(7) The [keywords] content must be one of the choice options, and it must match the content of the correct option completely and contiguously, with nothing missing (this differs from the choice node restrictions of the situational response question type).
(8) Within a single answer, clauses may be delimited by the five English half-width characters , . ! ? ; and multiple answers are separated by the vertical bar |.
(9) Each choice option must not exceed 100 words, not counting symbols.


[choice]
1. Snakes.
2. Children.
3. Cats.
[keywords]
cats
[question]
What did the woman dislike?
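
The requirement that [keywords] match one of the options in full can be checked before upload. This sketch assumes, based only on the example above (option `Cats.` and keyword `cats`), that case and clause punctuation are ignored when matching; the engine's actual comparison rules are not documented here. `keywords_match_choice` is a hypothetical helper.

```python
import re


def normalize(text):
    """Lowercase and drop the clause punctuation , . ! ? ; (an assumed normalization)."""
    return re.sub(r"[,.!?;]", "", text).strip().lower()


def keywords_match_choice(choice_lines, keywords):
    """True when every '|'-separated answer equals one complete option."""
    # Strip the leading 'N. ' serial number from each option before comparing.
    options = {normalize(re.sub(r"^\d+\. ", "", line)) for line in choice_lines}
    answers = [normalize(answer) for answer in keywords.split("|")]
    return all(answer in options for answer in answers)
```

With the options from the example, `keywords_match_choice(["1. Snakes.", "2. Children.", "3. Cats."], "cats")` is true, while a keyword that covers only part of an option is rejected.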

English free-form questions (topic)

Example of a test paper:
(1) Required node: [topic]. Note that entries are separated by line breaks.
(2) The first line, which gives the topic title, must be written as "serial number + dot + space + content", e.g. "1. " + title. Numbering must start from 1 and be consecutive. The separator must be a space, not a tab or any other character. The three characters ( ) [ must not appear in the title, and full-width characters must not be used in the title; otherwise labeling reports an error.
(3) The second line, which gives the topic content, must also be written as "serial number + dot + space + content", e.g. "1.1. " + content, starting from 1.1.; the separator must be a space, not a tab or any other character.
(4) If there is more than one topic content, the serial numbers must be consecutive: 1.1. , 1.2. , 1.3. and so on.
(5) Clauses are delimited by the five English half-width characters , . ! ? ;
(6) Each option must occupy a single line. If an option is manually broken onto a new line (automatic wrapping by the system excepted), leaving the second line without a serial number, an error is reported.
(7) Non-essential nodes: [number_replace], [vocabulary]. For their specification, see the non-essential node restrictions of the sentence question type.

[topic]

1. The Goose Thief

1.1.  Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised. Students were all fond of them. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students, and they all ran to the pond. The head teacher asked for an explanation. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!" The head teacher smiled and promised not to have them killed for the Christmas dinner.
[keypoint]

1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised.
2. Students were all fond of them.
3. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner.
4. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school.
5. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat.
6. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students,
7. They all ran to the pond.
8. The head teacher asked for an explanation.
9. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!"
10. The head teacher smiled and promised not to have them killed for the Christmas dinner.
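
The numbering scheme for [topic] and [keypoint] is easy to get wrong by hand, so generating it can be safer. This is a minimal sketch with a hypothetical helper `build_topic_paper`; it collapses internal whitespace so that each entry stays on one line, and it checks only the ( ) [ restriction on the title.

```python
def build_topic_paper(title, contents, keypoints=None):
    """Build a [topic] paper with one title, numbered contents, and optional key points."""
    if any(ch in title for ch in "()["):
        raise ValueError("the title must not contain ( ) or [")
    lines = ["[topic]", f"1. {title}"]
    for index, content in enumerate(contents, start=1):
        # Collapse line breaks and repeated spaces so the entry occupies one line.
        lines.append(f"1.{index}. {' '.join(content.split())}")
    if keypoints:
        lines.append("[keypoint]")
        for index, point in enumerate(keypoints, start=1):
            lines.append(f"{index}. {' '.join(point.split())}")
    return "\n".join(lines)
```

Each call produces the title as `1. title`, the contents as `1.1.`, `1.2.`, and so on, and the key points numbered from 1, with a single space after each serial number.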

English retelling questions (retell)

Example of a test paper:
(1) Required nodes: [topic], [keypoint]. Note that entries are separated by line breaks.
(2) The first line, which gives the topic title, must be written as "serial number + dot + space + content", e.g. "1. " + title. Numbering must start from 1 and be consecutive. The separator must be a space, not a tab or any other character. The three characters ( ) [ must not appear in the title, and full-width characters must not be used in the title; otherwise labeling reports an error.
(3) The second line, which gives the topic content, must also be written as "serial number + dot + space + content", e.g. "1.1. " + content, starting from 1.1.; the separator must be a space, not a tab or any other character.
(4) If there is more than one topic content, the serial numbers must be consecutive: 1.1. , 1.2. , 1.3. and so on.
(5) Clauses are delimited by the five English half-width characters , . ! ? ;
(6) Each option must occupy a single line. If an option is manually broken onto a new line (automatic wrapping by the system excepted), leaving the second line without a serial number, an error is reported.
(7) Non-essential nodes: [number_replace], [vocabulary]. For their specification, see the non-essential node restrictions of the sentence question type.



[topic]

1. The Goose Thief
1.1.  Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised. Students were all fond of them. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students, and they all ran to the pond. The head teacher asked for an explanation. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!" The head teacher smiled and promised not to have them killed for the Christmas dinner.
[keypoint]
1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised.
2. Students were all fond of them.
3. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner.
4. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school.
5. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat.
6. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students,
7. They all ran to the pond.
8. The head teacher asked for an explanation.
9. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!"
10. The head teacher smiled and promised not to have them killed for the Christmas dinner.

English Look and Talk (picture_talk)

Example of a test paper:
(1) Required node: [topic]. Note that entries are separated by line breaks. For the [topic] specification, see the required-node restrictions of the story retelling question type.
(2) Non-essential nodes: [number_replace], [vocabulary]. For their specification, see the non-essential node restrictions of the sentence question type.
(3) Regarding the non-essential node [keypoint]: the serial numbers of the options must be consecutive, and each option must be written as "serial number + dot + space + content".
(4) Regarding the non-essential node [keypoint]: if there are multiple options under the keypoint node, only the content of one of the options needs to be selected for segmentation.


[topic]

1. Throw Litter
1.1. Mary and her classmates went outing last weekend. Someone was flying kites, some people were having snacks. There were litters on the road. Mary picked up the waste bottles and paper the put them in the dustbin. The teacher praised Mary for her good deed.
1.2. Last weekend, Mary went to the park with her classmates. They had a picnic in the park. Some people flew kites there. They had great fun there. Mary saw some rubbish on the road. She picked up the rubbish and threw it into the dustbin. The teacher praised Mary.
1.3. Last Saturday, Mary's class went to the park. They brought some food and had a picnic on the grass. After that, they flew kites there. Suddenly, Mary found that there was some rubbish on the road. She then picked up the rubbish and threw it into the dustbin. Mary's teacher saw this. She said "Well done" to Mary. Mary was very happy.
1.4. Mary went to the park with her friend last weekend. They had a picnic there, while some people were flying kites. Mary's friend wanted to fly a kite too. So she threw waste bottles and paper on the ground and ran away. Mary saw this and picked up the rubbish. Then she threw it into the garbage can. A woman noticed what Mary had done. She praised Mary for her good behavior.
1.5. Mary went to the park to have a picnic with her friend last Sunday. They brought some juice and bread as lunch. After lunch, they joined other people to fly kites. Mary saw some waste bottles and paper on the ground. Someone threw them away after having a picnic. Mary cleaned the road, putting the garbage into a garbage can. A lady saw this and praised Mary for what she had done.
1.6. Last weekend, Mary and her classmates went to the park. Some of them flew kites, and some of them had food on the grass. Mary brought some juice, bread and biscuits to share with her friend. After they finished eating, her friend went to fly a kite. Mary gathered their waste bottles and paper and was about to threw them into the dustbin. Suddenly, she saw some garbage on the ground. She picked up the garbage, and threw it away with their waste bottles and paper. Her good behavior was noticed by the manager of the park. The manager praised her.
1.7. Last weekend, Mary went outing with her classmates. Mary and her friend were having drinks and some bread. Others were flying kites or playing games. After a while, there were litters on the ground. Mary saw these and started to pick up all the waste paper and bottles. She put them into the dustbin. Mary's teacher praised her for what she had done.
1.8. Mary went for an outing with her classmates last weekend. Some people played games and some people went to fly kites. Mary and Lily were having some snacks. When they were about to play, Mary noticed that there were litters around them. So she picked up the waste bottles and paper and threw them in the dustbin. Just then, her teacher saw it and praised Mary for what she did.
1.9. The school held an outing last weekend. Mary and her classmates had fun there. Some people were playing games while some were flying kites. Mary and one of her classmates were having some snacks. Then, Mary found that there were some waste paper and bottles on the ground. So she threw all of them into the dustbin. At last, the ground became clean and Mary was praised by her teacher.
1.10. Mary and her classmates went for an outing last weekend. They were very happy. Someone was flying kites, some were having food. After having lunch, they went on playing games. Mary noticed that there were some litters on the ground. So she picked up all the litters and then put them in the dustbin. Mary's good deed was saw by her teacher. The teacher praised Mary and felt proud of what she had done.
1.11. Last Saturday, Mary's teacher took her class to an outing. The whole class were very happy then. Some people were flying kites while some were playing games. At lunch time, they had food and drank juice together. After that, there were some waste bottles and paper on the road. Mary started to pick them up and threw them into the dustbin. Her teacher saw it and spoke highly of what Mary had done. Mary felt very proud of herself.
1.12. Last weekend Mary and her classmates went outing and had a picnic. Some people were flying kites, some people were having snacks. Suddenly, they found there was a lot of litter on the road. Mary picked up the waste bottles and paper the put them in the dustbin. The teacher praised Mary for her good behavior.
1.13. Last weekend Mary went to the park with Some friends. Some of them were flying kites. Some friends were eating food. Suddenly, they saw there was some rubbish on the road. Mary picked up the rubbish and put it into the garbage. The teacher said Mary was good.
1.14.  Last weekend Mary went to the park. Some classmates were flying kites, some classmate were eating food. Suddenly, they saw there was a lot of rubbish on the road. Mary picked up the rubbish and put it into the dustbin. The teacher said Mary was a good girl.
1.15.  Last weekend Mary had a picnic with her cousins in the park. Some were flying kites, some were eating food. They saw there was some litter on the road. Mary picked up the litter and threw it into the dustbin. Her mother said Mary was good.
1.16.  Last weekend Mary had a picnic with her cousins in the park. Some flew kites, some ate food. Suddenly, they saw someone dropped a lot of litter on the road. Mary picked up the litter and threw it into the dustbin. Her mother said Mary did a good job.
1.17.  Last weekend, Mary went to the park for a picnic with her friend. They brought a lot of food and enjoyed it very much. Lily went to fly kite but she left many rubbish on the ground. Marry cleaned it and put it into the rubbish can. The teacher saw it and she said to Marry, "you are a good girl." What a good girl!

English oral translation (oral_translation)

Example of a test paper:
(1) Necessary node: [topic], note the use of line breaks for separation. See [topic] restriction in story retelling question type necessary node for specification.
(2) Non-essential nodes: [number_replace], [vocabulary] specification see Sentence Question Type Non-essential Node Restriction, [keypoint] specification see English Picture Seeing and Speaking Question Type Non-essential Node Restriction.


[topic]

1. British People
1.1.  British people usually say "hello" or "nice to meet you" and shake your hand when they meet you for the first time. They behave politely in public. They think it's rude to push in before others. They always queue. They are very polite at home as well. When in Rome, do as the Romans do. When we are in a strange place, we should do as the local people do.
1.2. For the first meeting, the English will usually say "hello" or "nice to meet you" and shake hands with you. In the public places, they behave themselves well; they think that jumping in the line is a rude behavior, so they always line up. They are often very polite at home. When we are in a strange place, do in Rome as Rome does. We should behave well as local people.
1.3. When they meet for the first time, the British usually say "hello" or "nice to meet you", and shake hands with each other. In public, they behave themselves appropriately. They think it is impolite to jump the queue, and they always wait in line patiently for their turns. They are also very polite at home. As the saying goes, "when in Rome, do as the Romans do". When we are in a strange place, we should act as the locals do.
1.4. When first meet, English are likely to say "hello" or "nice to meet you" and shake hands with you. They behave well in public. They usually line up because they think queue jumping is very impolite. And they are also very polite at home. There is an old saying "Do in Rome as Rome does". So when we are in a new place, we should behave ourselves as the locals do.
1.5. When meeting for the first time, Englishmen usually say "hello" or "nice to meet you" with a handshake. They behave themselves well in public places. They regard jumping a queue as one of the rude behavior, so they always queue up. They are also very polite at home. When in Rome, do as the Romans do. When we are in a strange place, we should behavior just like the local people.
1.6. For the first meeting, English people usually say "Hello" or "Nice to meet you" and shake hands with you. In the public place, they also act very decently. In their views, it is very impolite to cut in line. They have formed a habit to wait in a queue. At home, they are also very polite. When in a strange place, we should do in Rome as the Romans do. Moreover, it is also polite that we behave like the local people.
1.7. In first meeting, the English often say "hi" or "nice to meet you!" and then shake hands with you. In public occasions, they behave mannerly. They think jumping a queue is impolite and they always line up. Also, they are polite at home. When in Rome do as the Romans do. When we are in a strange land, we should behave like the natives.
[vocabulary]
behavior /b ih 'hh ey v y ax/
uncourteous /,ah n 'k er t ir s/
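
A paper in this layout can also be assembled programmatically. The following is a minimal sketch, assuming the layout of the example above: a [topic] node, a numbered title, numbered reference answers on separate lines, and an optional [vocabulary] node. The name build_paper and its parameters are hypothetical and are not part of the API.

```python
from typing import List, Optional


def build_paper(title: str, answers: List[str],
                vocabulary: Optional[List[str]] = None) -> str:
    """Assemble paper text that mirrors the example layout (an assumption)."""
    lines = ["[topic]", "", f"1. {title}"]
    # Reference answers are numbered 1.1., 1.2., ... one per line.
    for i, answer in enumerate(answers, start=1):
        lines.append(f"1.{i}. {answer}")
    # The [vocabulary] node is optional and is only emitted when entries exist.
    if vocabulary:
        lines.append("[vocabulary]")
        lines.extend(vocabulary)
    return "\n".join(lines)
```

Each vocabulary entry is passed through unchanged, so the caller is responsible for the word and phoneme notation.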

Chinese learning engine XML field descriptions

Question type: read_syllable

Description of read_syllable hierarchy fields:

Properties Annotations
phone_score acoustic score
fluency_score fluency score
tone_score tone score
total_score total score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)
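
Because beg_pos, end_pos, and time_len are expressed in frames of 10ms, they usually need converting before display. A minimal sketch of that conversion follows; the function names are illustrative only.

```python
from typing import Tuple

FRAME_MS = 10  # each frame is equivalent to 10 ms, per the field tables


def frames_to_seconds(frames: int) -> float:
    """Convert a frame count (beg_pos, end_pos, or time_len) to seconds."""
    return frames * FRAME_MS / 1000.0


def span_seconds(beg_pos: int, end_pos: int) -> Tuple[float, float]:
    """Return the (start, end) of a node in seconds."""
    return frames_to_seconds(beg_pos), frames_to_seconds(end_pos)
```

For example, a node with beg_pos=0 and end_pos=250 covers the first 2.5 seconds of the audio.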

Sentence Hierarchy Field Description:

Properties Annotations
time_len time length (unit: frame, each frame is equivalent to 10ms)
beg_pos/end_pos start/end position (unit: frame, each frame equals to 10ms)
content Question Paper Content

word hierarchy field descriptions:

Properties Annotations
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

syll hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (frame)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
time_len time length (unit: frame, each frame is equivalent to 10ms)

phone hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
perr_msg error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error); when dp_message is not 0, perr_msg may take the same value as dp_message
time_len time length (unit: frame, each frame is equivalent to 10ms)
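
The dp_message codes at the syll and phone levels can be turned into readable labels with a small lookup. This is a minimal sketch based only on the values listed in the tables above; the helper name is hypothetical, and attribute values are assumed to arrive as strings, as they would when read from XML.

```python
# Codes as listed in the syll and phone tables.
DP_MESSAGE = {
    0: "correct",
    16: "missed",
    32: "added",
    64: "read back",
    128: "replaced",
}


def describe_dp_message(value) -> str:
    """Map a dp_message value (int or numeric string) to a label."""
    code = int(value)
    return DP_MESSAGE.get(code, f"unknown ({code})")
```

Unknown values are reported rather than raised, since the tables do not state that the list is exhaustive.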

Question type: read_word

Description of read_word hierarchical field:

Properties Annotations
phone_score acoustic score
fluency_score fluency score (returns 0 for now)
tone_score tone score
total_score Total Score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

Sentence Hierarchy Field Description:

Properties Annotations
time_len time length (unit: frame, each frame is equivalent to 10ms)
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content

word hierarchy field descriptions:

Properties Annotations
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

syll hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
time_len time length (unit: frame, each frame is equivalent to 10ms)

phone hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
perr_msg error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error); when dp_message is not 0, perr_msg may take the same value as dp_message
time_len time length (unit: frame, each frame is equivalent to 10ms)

Question type: read_sentence

Description of read_sentence hierarchy field:

Properties Annotations
phone_score acoustic score
fluency_score fluency score
tone_score tone score
total_score total score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

Sentence Hierarchy Field Description:

Properties Annotations
phone_score acoustic score
fluency_score fluency score
tone_score tone score
total_score total score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

word hierarchy field descriptions:

Properties Annotations
beg_pos/end_pos start/end positions (frames)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

syll hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (frame)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
time_len time length (unit: frame, each frame is equivalent to 10ms)

phone hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (frame)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
perr_msg error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error); when dp_message is not 0, perr_msg may take the same value as dp_message
time_len time length (unit: frame, each frame is equivalent to 10ms)

Question type: read_chapter

Description of read_chapter hierarchy fields:

Properties Annotations
phone_score acoustic score
fluency_score fluency score
tone_score tone score
total_score total score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

Sentence Hierarchy Field Description:

Properties Annotations
phone_score acoustic score
fluency_score fluency score
tone_score tone score
total_score total score
beg_pos / end_pos start/end position (unit: frame, each frame is equivalent to 10ms)
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

word hierarchy field descriptions:

Properties Annotations
beg_pos / end_pos start/end position (frame)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
time_len time length (unit: frame, each frame is equivalent to 10ms)

syll hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (frame)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
symbol Pinyin: the number represents the tone, 5 represents the soft tone
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
time_len time length (unit: frame, each frame is equivalent to 10ms)

phone hierarchy field description:

Properties Annotations
beg_pos / end_pos start/end position (frame)
dp_message insertion/omission message: 0 (correct), 16 (missed), 32 (added), 64 (read back), 128 (replaced)
content Question Paper Content
rec_node_type paper(paper content),sil(non-paper content)
perr_msg error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error); when dp_message is not 0, perr_msg may take the same value as dp_message
time_len time length (unit: frame, each frame is equivalent to 10ms)

Learning engine xml output table one

Question type: read_word

Read_word layer description:

Properties Annotations
beg_pos start boundary time of the word list
content content of the word list
end_pos end boundary time of the word list
accuracy_score accuracy score
standard_score standard score
except_info exception information
is_rejected whether the reading was rejected
total_score average of the total scores of all words

Sentence (sentence) level description:

Properties Annotations
beg_pos sentence start boundary time
content sentence content
end_pos end-of-sentence boundary time
index sentence index

Description of the word layer

Properties Annotations
beg_pos word start boundary time
content word content
end_pos Word End Boundary Time
dp_message word insertion/omission message
global_index index of the word in the chapter
index index of the word in the sentence
property word property
total_score word total score
pitch word fundamental frequency information (reserved field, can be ignored)
pitch_beg word fundamental frequency start value
pitch_end word fundamental frequency end value
werr_msg error result for a misread word (not output for correctly read words)

Syll(syllable) layer description:

Properties Annotations
beg_pos Beginning of syllable boundary time
content syllabic content
end_pos syllable end boundary time
serr_msg syllable error message
syll_accent syllable stress marker

Phoneme layer description:

Properties Annotations
beg_pos Phoneme start boundary time
content phoneme content
end_pos phoneme end boundary time
dp_message phoneme insertion/omission message
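
The layered result can be walked with a standard XML parser. The sketch below is illustrative only: it assumes that each layer in the tables above is an XML element of the same name (read_word, sentence, word, syll, phone) and that the listed properties are XML attributes. The SAMPLE document is hand-written for this example and is not real engine output.

```python
import xml.etree.ElementTree as ET

# Hypothetical result fragment shaped after the layer tables.
SAMPLE = """<read_word beg_pos="0" end_pos="120" content="apple" total_score="85.5">
  <sentence beg_pos="0" end_pos="120" content="apple" index="0">
    <word beg_pos="10" end_pos="90" content="apple" dp_message="0" total_score="85.5">
      <syll beg_pos="10" end_pos="90" content="ae p ax l" serr_msg="0">
        <phone beg_pos="10" end_pos="30" content="ae" dp_message="0"/>
      </syll>
    </word>
  </sentence>
</read_word>"""


def word_scores(xml_text: str):
    """Return (content, total_score, dp_message) for every word element."""
    root = ET.fromstring(xml_text)
    results = []
    for word in root.iter("word"):
        results.append((
            word.get("content"),
            float(word.get("total_score", "0")),
            int(word.get("dp_message", "0")),
        ))
    return results
```

Using iter("word") finds word elements at any depth, so the same helper works whether or not extra wrapper elements surround the layers.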

Question type: read_sentence

read_chapter (chapter) layer description:

Properties Annotations
accuracy_score accuracy score
beg_pos chapter start time
content chapter content
end_pos chapter end time
except_info exception information
fluency_score fluency score
integrity_score integrity score
standard_score standard score
is_rejected whether the reading was rejected
total_score total score
word_count number of words in the chapter

Sentence level description:

Properties Annotations
beg_pos sentence start boundary time
content sentence content
end_pos end-of-sentence boundary time
accuracy_score accuracy score
fluency_score fluency score
standard_score standard score
index sentence index
score(replace with total_score) total_score, struct (hidden)
word_count number of words in the sentence

Word layer description:

Properties Annotations
beg_pos word start boundary time
content word content
end_pos Word End Boundary Time
dp_message word insertion/omission message
global_index index of the word in the chapter
index index of the word in the sentence
property word property
total_score word total score
pitch word fundamental frequency information (reserved field, can be ignored)
pitch_beg word fundamental frequency start value
pitch_end word fundamental frequency end value
werr_msg error result for a misread word (not output for correctly read words)

Syll(syllable) layer description:

Properties Annotations
beg_pos Beginning of syllable boundary time
content syllabic content
end_pos syllable end boundary time
serr_msg syllable error message
syll_accent syllable stress marker

Phoneme layer description:

Properties Annotations
beg_pos Phoneme start boundary time
content phoneme content
end_pos phoneme end boundary time
dp_message phoneme insertion/omission message

Question type: read_chapter

read_chapter (chapter) layer description:

Properties Annotations
accuracy_score accuracy score
beg_pos chapter start time
content chapter content
end_pos chapter end time
except_info exception information
fluency_score fluency score
integrity_score integrity score
standard_score standard score
is_rejected whether the reading was rejected
total_score total score
word_count number of words in the chapter

Sentence (sentence) level description:

Properties Annotations
beg_pos sentence start boundary time
content sentence content
end_pos end-of-sentence boundary time
accuracy_score accuracy score
fluency_score fluency score
standard_score standard score
index sentence index
score(replace with total_score) total_score, struct (hidden)
word_count number of words in the sentence

Word layer description:

Properties Annotations
beg_pos word start boundary time
content word content
end_pos Word End Boundary Time
dp_message word insertion/omission message
global_index index of the word in the chapter
index index of the word in the sentence
property word property
total_score word total score
werr_msg error result for a misread word (not output for correctly read words)

Syll(syllable) layer description:

Properties Annotations
beg_pos syllable start boundary time
content syllable content
end_pos syllable end boundary time
serr_msg syllable error message
syll_accent syllable stress marker

Phoneme layer description:

Properties Annotations
beg_pos Phoneme start boundary time
content phoneme content
end_pos phoneme end boundary time

Type of question: topic (free-response questions in English)

Description of rec_paper layer:

Properties Annotations
accuracy_score semantic accuracy score
beg_pos start time of reading aloud
content read aloud recognized content
end_pos end time of reading aloud
except_info Exception Information
phone_score Pronunciation accuracy score
speeking_speed speed of speech (typically 140-200 words per minute)
total_score Total Score

Sentence Layer Description:

Properties Annotations
content sentence content
index sentence index

Word Layer Description:

Properties Annotations
beg_pos Word Start Boundary Time
content word content
end_pos Word End Boundary Time

Question type: simple_expression (English situational response)

rec_paper layer description:

Properties Annotations
beg_pos start time of reading aloud
content read aloud recognized content
end_pos end time of reading aloud
except_info Exception Information
phone_score Pronunciation accuracy score
total_score Total Score

Sentence Layer Description:

Properties Annotations
content sentence content
index sentence index

word layer description:

Properties Annotations
beg_pos Word Start Boundary Time
content word content
end_pos Word End Boundary Time

Question type: read_choice (English multiple choice)

free_choice layer description:

Properties Annotations
beg_pos start time of reading aloud
content read aloud recognized content
end_pos end time of reading aloud
except_info Exception Information
total_score Total Score

Learning engine xml output table II

Notes and additional information

Precautions Description
is_rejected return field (some assessment question types do not return this field):
true: rejected, indicating that the engine detected gibberish (random) reading and the score cannot be used as a reference
false: normal
Standardness score in word, sentence, and chapter question types: a standardness score is given only if the number of words in the text is >= 5.
Gibberish detection in word, sentence, and chapter question types: gibberish detection is available only if the number of words in the text is >= 5. (There is currently no gibberish detection for free-form question types.)
except_info attribute values:
except_info=28673 (hexadecimal 0x7001): the engine judges the audio to contain no voice or to have low volume
except_info=28676 (hexadecimal 0x7004): the engine judges the audio to be gibberish
except_info=28680 (hexadecimal 0x7008): the engine judges the audio to have a low signal-to-noise ratio
except_info=28690 (hexadecimal 0x7012): the engine judges the audio to be truncated
except_info=28689 (hexadecimal 0x7011): the engine judges that there is no audio input; check whether the audio or the recording equipment is working properly
dp_message attribute values:
dp_message=0: the engine judges that the word or phoneme was read normally
dp_message=16: the engine judges that the word or phoneme was missed
dp_message=32: the engine judges that the word or phoneme was added
property and werr_msg attributes (used for effect optimization; can be ignored):
The werr_msg attribute appears only when the engine judges that the word was read incorrectly. For example, property=16 means the word should be linked to the next word in connected speech; if the XML also contains werr_msg=512, the engine judged that the speaker did not link the word. Otherwise, the engine judged that the word was read correctly.
Linking (consecutive reading): property=16; werr_msg=512
Stress: property=32; werr_msg=2048
End-of-sentence rising and falling intonation: property=64; werr_msg=4096
Sense-group (intentional) pauses: property=2; werr_msg=256
Half-sentence: property=12. When a word in the text is followed by a comma, the engine sets property=12 on the word before the comma as a clause marker within the sentence. It marks a half-sentence and has no other special meaning.
serr_msg attribute values:
serr_msg=0: the engine judges that the syllable was read correctly
serr_msg=1: the engine judges that the syllable was read incorrectly
serr_msg=2048: the syllable should be stressed, but the engine judges that it was not stressed (syll_accent is 1 in this case; used for effect optimization, and it is recommended to ignore this case)
serr_msg=2049: the syllable should be stressed, but the engine judges that it was not stressed and was also read incorrectly (syll_accent is 1 in this case; used for effect optimization, and this case can be ignored)
syll_accent attribute values:
syll_accent=0: this syllable does not need to be stressed
syll_accent=1: this syllable needs to be stressed
Partial pinyin annotation in test papers: for example, "When <zai4>Da<da2>Rui was eight years old, one day he wanted to <xiang3>go to the movies." Pinyin may be annotated for no more than one-third of the characters in the entire paper.
sil, silv, and fil in the content attribute: in some question types (words, phrases, sentences, chapters), these values may appear as paper content at the syll and phone levels. sil and silv indicate silence; fil indicates noise.
Reserved fields: gwpp, pitch, reject_type, no_plo_word, dur_value, magnitude_value, pitch_value, and score_pattern are reserved fields returned by the model and can be ignored.
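
The except_info values above can be decoded with a lookup keyed on the hexadecimal codes. This is a minimal sketch covering only the five values listed in the notes; the helper name is hypothetical, and any other value is reported as undocumented rather than interpreted.

```python
# Codes and meanings as listed in the notes table.
EXCEPT_INFO = {
    0x7001: "no voice or low volume",
    0x7004: "gibberish",
    0x7008: "low signal-to-noise ratio",
    0x7012: "truncated audio",
    0x7011: "no audio input",
}


def describe_except_info(value) -> str:
    """Map an except_info value (int or decimal string) to its description."""
    code = int(value)
    return EXCEPT_INFO.get(code, f"not documented here ({code})")
```

The XML carries the value in decimal, so "28673" and 0x7001 refer to the same entry.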

Error code

Error code Error code description
10163 Parameter validation failed on the client side; change the request parameters according to the description in the returned message field
10313 The app_id was not passed in the first frame, or the app_id passed does not match the api_key
40007 Audio decoding failed, please check if the transmitted audio corresponds to the encoding format described in the encoding field.
11201 Interface usage has exceeded the maximum purchased limit
10114 Request timed out, session time exceeded 300s, please control the session time to keep it no longer than 300s
10043 Audio decoding failed, please make sure the transmitted audio encoding format is consistent with the request parameters.
10161 base64 decoding failed, check if the sent data is encoded in base64
10200 Read data timeout; check whether no data was sent for a total of 10s while the connection was left open
10160 Illegal request data format, check if request data is legal json
11200 Function unauthorized
60114 Assessment audio is too long
10139 Parameter error
48196 instance prohibits repeated calls to this interface
40006 Invalid parameter
40010 no response
40016 Initialization failure
40017 Not initialized
40023 Invalid configuration
40034 Parameter not set
40037 No assessment text
40038 No assessment audio
40040 Illegal data
42306 Insufficient authorization
68676 Gibberish reading
30002 The ssb frame has no cmd parameter
48195 The assessment test paper is not set for the instance, or the test question format is wrong. Check whether the assessment text matches the question type (English question types in particular need special markers in the test questions) and whether ent, category, and other parameters are set
30011 sid is empty, for example because aus was not set when uploading audio
68675 Abnormal audio data; check that the audio is 16k, 16bit, mono and that the aue parameter value matches the audio type
48205 The instance was not evaluated, for example results were fetched without any recording, or the uploaded audio was empty
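
Client code often maps these codes to actionable hints for logs or user messages. The sketch below covers a handful of the codes from the table; the hint wording is paraphrased from the descriptions above, and the function name is hypothetical.

```python
# A subset of the error code table, paraphrased as short hints.
ERROR_HINTS = {
    10114: "session exceeded 300s; keep the session shorter",
    10160: "request data is not legal JSON",
    10161: "data is not valid base64",
    10200: "no data sent for 10s; keep sending audio or close the connection",
    10313: "app_id missing in the first frame or not matching api_key",
    40007: "audio does not match the declared encoding format",
    10043: "audio does not match the declared encoding format",
    11201: "usage limit exceeded",
    60114: "assessment audio is too long",
}


def explain_error(code: int) -> str:
    """Return a short hint for a known code, or point back to the table."""
    return ERROR_HINTS.get(code, f"unlisted code {code}; see the error code table")
```

Codes that are not in the dictionary fall through to a generic message, so the helper stays safe if the service returns a code that is not listed here.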

Demos

Note: demo is just a simple call example, not suitable for direct use in complex and changing production environments

Pronunciation Assessment Streaming API demo java language

Pronunciation Assessment Streaming API demo python3 language

Pronunciation Assessment Streaming API demo go language

Pronunciation Assessment Streaming API demo nodejs language

Pronunciation Assessment Streaming API demo C# language

Frequently Asked Questions

What are the scoring criteria for Pronunciation Assessments?

Dimension Weight (adults) Weight (elementary school students) Grade A (9-10 points)
Accuracy 50% 60% Word pronunciation is accurate and clear
Fluency 30% 30% Reads aloud fluently at a normal pace, with essentially no pauses, repetitions, or self-corrections
Standardness (including emotion) 20% 10% Pronunciation habits are in line with native English standards (no Chinese accent); flexible use of pronunciation techniques such as linking, stress, weak forms, and plosion; good rhythm; full of emotion
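
To illustrate how the weights in this table combine, the sketch below computes a weighted sum of the three dimension scores. It is only an illustration of the percentages: the document does not state that the engine derives total_score this way, and the function and group names are hypothetical.

```python
# Weights taken from the table: adults vs. elementary school students.
WEIGHTS = {
    "adult": {"accuracy": 0.5, "fluency": 0.3, "standard": 0.2},
    "elementary": {"accuracy": 0.6, "fluency": 0.3, "standard": 0.1},
}


def weighted_total(accuracy: float, fluency: float, standard: float,
                   group: str = "adult") -> float:
    """Combine the three dimension scores using the weights for a group."""
    w = WEIGHTS[group]
    total = (accuracy * w["accuracy"]
             + fluency * w["fluency"]
             + standard * w["standard"])
    return round(total, 2)
```

With accuracy 80, fluency 90, and standardness 70, the adult weights give 81 and the elementary school weights give 82, which shows how the larger accuracy weight shifts the result.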

How many concurrent channels does the Pronunciation Assessment Web api support?

Answer: 50-way concurrency supported by default

How long does Pronunciation Assessment support voice input at most?

A: For all assessment question types, it is recommended to keep voice input under 3 minutes. If the audio sending session lasts more than 5 minutes, error 10114 or 60114 is reported.
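
A client can estimate the audio length before sending. The sketch below assumes raw PCM at 16k, 16bit, mono, which gives 32000 bytes per second, and assumes the audio is streamed at roughly real-time speed so that audio length approximates session length. The names and the three result strings are illustrative.

```python
BYTES_PER_SECOND = 16000 * 2 * 1  # 16 kHz sample rate, 2 bytes per sample, mono
RECOMMENDED_SECONDS = 3 * 60      # recommended upper bound from the answer above
MAX_SESSION_SECONDS = 5 * 60      # beyond this, error 10114 or 60114 is reported


def pcm_duration_seconds(num_bytes: int) -> float:
    """Duration of raw 16k/16bit/mono PCM data in seconds."""
    return num_bytes / BYTES_PER_SECOND


def check_length(num_bytes: int) -> str:
    """Classify the audio length against the recommended and hard limits."""
    duration = pcm_duration_seconds(num_bytes)
    if duration > MAX_SESSION_SECONDS:
        return "too long"
    if duration > RECOMMENDED_SECONDS:
        return "over recommended"
    return "ok"
```

Compressed formats such as mp3 or speex-wb would need their own duration calculation; the byte arithmetic here applies to uncompressed PCM only.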

What are the audio requirements for Pronunciation Assessment support?

A: The audio sampling rate is 16k, sampling precision 16 bit, mono audio. For sample audio, please refer to the audio provided in the java demo.
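
For WAV files, these properties can be checked locally before upload. The sketch below uses Python's standard wave module; make_silent_wav is a small helper included only so the check can be tried without a real recording.

```python
import io
import wave


def wav_matches_requirements(source) -> bool:
    """Return True if the WAV is 16 kHz, 16-bit, mono.

    `source` may be a file path or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        return (wav.getframerate() == 16000
                and wav.getsampwidth() == 2
                and wav.getnchannels() == 1)


def make_silent_wav(rate: int = 16000, channels: int = 1) -> io.BytesIO:
    """Build a tiny in-memory WAV of silence for trying the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * 160)
    buf.seek(0)
    return buf
```

Raw pcm files carry no header, so this check applies to wav input only.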

What is the difference between the new streaming version of the assessment and the previous regular version (which has been taken offline)?

A: The main differences are:
1. The streaming version adopts a new architecture and outperforms the regular version overall in product features, assessment results, service stability, and other aspects.
2. The streaming version supports more question types. In addition to the word, phrase, and chapter question types supported by the regular version, it also supports question types such as English situational response, free talk, picture description, and oral composition (note that these question types must be used together with the customized test paper service; please check the corresponding package description on the product details page).
3. Because of the new architecture, the streaming version currently returns results only in XML format; JSON format support is planned.
4. The streaming version uses the WebSocket protocol, while the regular version was based on HTTP, so the access methods differ. Please refer to the detailed development documentation and sample code for integration.

How do users of the old MSC SDK for Pronunciation Assessment (Normal Edition) switch to the Pronunciation Assessment (Streaming Edition) interface capability?

Answer: The parameters need to be modified as follows:
1. Set the mandatory parameter sub=ise;
2. The Chinese setting must transmit the parameter ent=cn_vip, and the English setting must transmit the parameter ent=en_vip;
3. Add the two mandatory parameters as above to complete the use of the Pronunciation Assessment (streaming version) interface capability;
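
The two mandatory values can be collected in one place, as in the sketch below. It only gathers the sub and ent values named in the steps above; how they are passed to the MSC SDK is not shown in this document, so the function name and the plain dictionary are assumptions.

```python
# ent values named in step 2, keyed by assessment language.
ENT_BY_LANGUAGE = {"chinese": "cn_vip", "english": "en_vip"}


def migration_params(language: str) -> dict:
    """Return the two mandatory parameters for the streaming capability."""
    key = language.lower()
    if key not in ENT_BY_LANGUAGE:
        raise ValueError(f"unsupported language: {language}")
    return {"sub": "ise", "ent": ENT_BY_LANGUAGE[key]}
```

Rejecting unknown languages early avoids sending a request with a missing ent value.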

How can I prevent users from getting high marks for gibberish or random reading?

A: The assessment result includes the is_rejected field. When its value is true, the result was rejected because the user read gibberish, so developers can use this field to judge whether the user read gibberish this time. If the engine reports gibberish, the score should be considered unreliable. The cause can be initially determined from the except_info attribute value.
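
A minimal sketch of that check follows. It assumes is_rejected and except_info are XML attributes on the same element of the result, and it searches every element because the element that carries them depends on the question type. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def rejection_info(xml_text: str):
    """Return (is_rejected, except_info) from the first element that has
    an is_rejected attribute, or (None, None) if the field is absent."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if "is_rejected" in element.attrib:
            rejected = element.get("is_rejected").strip().lower() == "true"
            return rejected, element.get("except_info")
    return None, None
```

A result of (None, None) means the question type did not return the field, which should be treated differently from an explicit false.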