NWJC2Vec: Word embedding dataset from ‘NINJAL Web Japanese Corpus’

Masayuki Asahara

doi:10.1075/term.00011.asa

What is it about?

In this paper, we present NWJC2Vec, a word embedding dataset constructed from the ‘NINJAL Web Japanese Corpus (NWJC)’, a Web-crawled text corpus containing 25.8 billion tokens. We construct two versions of the dataset: one based on surface forms, and the other based on the complete morpheme information provided by UniDic, a lexicon for the Japanese morphological analyser MeCab. We evaluate the dataset by comparing it with the ‘Word List by Semantic Principles (Bunrui Goihyo)’.
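As a rough illustration of how a word embedding dataset of this kind is typically used, the sketch below parses vectors and compares words by cosine similarity. It assumes the common word2vec text format (a header line with vocabulary size and dimensionality, then one word and its vector per line); this summary does not state the distribution format of NWJC2Vec. The toy vectors and the function names `parse_word2vec_text` and `cosine_similarity` are hypothetical and invented for the example.

```python
import math

# Toy stand-in for an embedding file in word2vec text format (an assumption):
# first line "<vocab_size> <dimensions>", then "<word> <v1> ... <vn>" per line.
# These vectors are invented for illustration; they are NOT from NWJC2Vec.
TOY_EMBEDDINGS = """\
3 3
犬 1.0 0.0 0.0
猫 0.8 0.6 0.0
車 0.0 0.0 1.0
"""


def parse_word2vec_text(text):
    """Parse word2vec text-format content into a dict of word -> list of floats."""
    lines = text.strip().splitlines()
    vocab_size, dim = (int(x) for x in lines[0].split())
    vectors = {}
    for line in lines[1:]:
        parts = line.rstrip().split(" ")
        word, values = parts[0], [float(v) for v in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {word!r}, got {len(values)}")
        vectors[word] = values
    if len(vectors) != vocab_size:
        raise ValueError(f"header says {vocab_size} words, found {len(vectors)}")
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vectors = parse_word2vec_text(TOY_EMBEDDINGS)
sim_dog_cat = cosine_similarity(vectors["犬"], vectors["猫"])  # dog vs. cat
sim_dog_car = cosine_similarity(vectors["犬"], vectors["車"])  # dog vs. car
print(sim_dog_cat, sim_dog_car)
```

In an evaluation against a thesaurus such as the Bunrui Goihyo, one would expect words in the same semantic category to score higher than words in different categories; the exact evaluation protocol used in the paper is not described in this summary.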

This page is a summary of: NWJC2Vec, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, May 2018, John Benjamins, DOI: 10.1075/term.00011.asa.

Contributors

The following have contributed to this page

Masayuki Asahara
National Institute for Japanese Language and Linguistics, Japan
