The one-million-word Lancaster Corpus of Mandarin Chinese (LCMC) comes in two versions: a Pinyin version and a character version. The corpus is freely available for academic and educational use from the European Language Resources Association (ELRA; ELDA catalog No. W0037). It is also available from the Oxford Text Archive (OTA).