This research project was a collaboration between TransformELT and the Chinese Basic Education Curriculum and Teaching Material Research Center (BECTMRC). 

The project sought to identify:

  1. the most commonly used medium- to high-frequency, age-appropriate language chunks presented in the updated 2021 New National English Curriculum (NNEC), which covers Grades 3 to 9, by comparing them with the age-appropriate lexical chunks commonly used by similarly aged native-speaking children in the UK.
  2. prominent gaps in the high-frequency language of the NNEC that can be addressed in future revisions of teaching materials.

The project team created a corpus of native-speaking children’s output, that is, language produced by the children themselves. Large samples of language input written for the NNEC age group (i.e. ages 9 to 15) were also collected, as children’s language output is strongly influenced by the language they encounter in written texts. Project outputs take the form of an online database and a ‘book of chunks’ (with and without metadata).

The research outputs will enable Chinese curriculum designers and materials writers for Grades 3 to 9 to check their intuitions about the language used by similarly aged native-speaking children in the UK, and to enhance the NNEC word lists with vocabulary items that bring their teaching materials into closer alignment with the English used by native-speaking young people.

Please follow the provided link to the TransformELT website, where you can find:

  1. Book of chunks with metadata
  2. Book of chunks without metadata
  3. Top 400 Lemmas taken from the corpus data
  4. Corpus database