NxNlp 1.0.43
NxNlp
Next-generation .NET NLP library for lexicon management, WordNet, spell checking, language detection, text segmentation, POS tagging, and more. Supports 66 languages with modular data packages.
Features
- Dictionary Lookup -- WordNet 3.0, Webster 1913, CC-CEDICT, Open Multilingual Wordnet (19 languages), AI-generated concise dictionaries
- Language Detection -- NTextCat (82 languages) + Catalyst neural detector fallback
- POS Tagging & Tokenization -- OpenNLP 2.5 Universal Dependencies models (36 languages) + ONNX transformer models
- Spell Checking -- Hunspell dictionaries for 56 languages
- Chinese NLP -- Jieba word segmentation, CC-CEDICT lookup, Chinese WordNet
- Japanese NLP -- MeCab morphological analysis, Japanese WordNet
- EPUB Parsing -- Extract text, metadata, TOC, and spine from EPUB e-books
- LLM Integration -- OpenAI-powered article generation, translation, word explanation
- Runtime Data Management -- On-demand download and installation of data packages from NuGet feed
Quick Start
Installation
# Core library
dotnet add package NxNlp --source https://nuget.easyrote.net/v3/index.json
# Language pack (includes all NLP data for a language)
dotnet add package NxNlp.LangPack.en --source https://nuget.easyrote.net/v3/index.json
dotnet add package NxNlp.LangPack.zh --source https://nuget.easyrote.net/v3/index.json
Or add the feed to your NuGet.Config:
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <add key="nxnlp" value="https://nuget.easyrote.net/v3/index.json" />
  </packageSources>
</configuration>
Language Detection
using System.Threading; // CancellationToken
using NxNlp.Text;
var detector = new LangDetector(NxNlpDataPath.LangDetectDir);
var lang = await detector.DetectLangAsync("This is an English sentence.", CancellationToken.None);
// lang == "en"
var lang2 = await detector.DetectLangAsync("これは日本語の文です。", CancellationToken.None);
// lang2 == "ja"
Runtime Data Management
using System;
using System.Net.Http; // HttpClient
using NxNlp.DataManager;
var http = new HttpClient();
var feed = new NuGetFeedClient(http, "https://nuget.easyrote.net/v3/index.json");
var manager = new DataPackageManager(feed);
// List available packages
foreach (var pkg in manager.GetAvailablePackages())
    Console.WriteLine($"{pkg.PackageId}: {pkg.Description}");
// Install a data package at runtime, reporting download progress
var progress = new Progress<DownloadProgress>(p =>
    Console.WriteLine($"{p.Phase}: {p.BytesDownloaded}/{p.TotalBytes}"));
await manager.InstallAsync("NxNlp.Data.Webster", progress);
// Query which packages a feature requires
var required = manager.GetRequiredPackages("dictionary-wordnet");
Language Pack Discovery
using NxNlp.Common;
// List installed language packs
var languages = NxNlpLangPackManager.GetInstalledLanguages();
// e.g. ["en", "ja", "zh"]
// Check what's in a language pack
var contents = NxNlpLangPackManager.GetContents("zh");
// contents.HasUdModel, contents.HasHunspell, contents.HasOpenNlp25,
// contents.HasOmw, contents.HasJieba, contents.CcDictIds
// Get specific data paths
var udModel = NxNlpLangPackManager.GetUdModelPath("en");
var hunspell = NxNlpLangPackManager.GetHunspellPath("en");
var ccDict = NxNlpLangPackManager.GetCcDictPath("en", "en");
Data Packages
NxNlp uses a dual-track packaging system:
Module Packages (feature-scoped)
Install individual features as needed:
| Package | Description | Size |
|---|---|---|
| NxNlp.Data.OpenNlp25 | OpenNLP 2.5 UD models (36 languages) | ~50 MB |
| NxNlp.Data.WordNet | WordNet 3.0 English dictionary | ~30 MB |
| NxNlp.Data.Webster | Webster's Unabridged Dictionary (1913) | ~45 MB |
| NxNlp.Data.CeDict | CC-CEDICT Chinese-English dictionary | ~11 MB |
| NxNlp.Data.CcDict.* | AI-generated concise dictionaries (per language pair) | 3--27 MB each |
| NxNlp.Data.Lexicons | EasyRote internal lexicon databases | ~50 MB |
| NxNlp.Data.Jieba | Chinese word segmentation data | ~15 MB |
| NxNlp.Data.NMeCab | Japanese MeCab morphology dictionary | ~14 MB |
| NxNlp.Data.OpenNlp | Language detection profiles (NTextCat) | ~20 MB |
| NxNlp.Data.Onnx | ONNX transformer POS models | ~1.5 GB |
| NxNlp.Data.Omw.* | Open Multilingual Wordnet (28 packages) | 1--25 MB each |
Language Packs (language-scoped)
One package per language, bundles everything needed for that language:
dotnet add package NxNlp.LangPack.en # English
dotnet add package NxNlp.LangPack.zh # Chinese
dotnet add package NxNlp.LangPack.ja # Japanese
dotnet add package NxNlp.LangPack.de # German
# ... 66 languages total
Each language pack may include:
| Component | Description | Inclusion |
|---|---|---|
| Hunspell dictionary | Spell checking (.aff/.dic) | If available for language |
| UD model | POS tagging, tokenization | If available for language |
| OpenNLP 2.5 models | Sentence detection, tokenization | If available for language |
| OMW WordNet | Multilingual WordNet synonyms | If available for language |
| CC Dict (monolingual) | AI dictionary (lang -> lang) | If available |
| CC Dict (to English) | AI dictionary (lang -> en) | If available |
| Jieba | Chinese segmentation | zh only |
| NMeCab | Japanese morphology | ja only |
Data Path Resolution
NxNlp looks for data files in this order:
1. NXNLP_DATA_DIR environment variable
2. NxNlp.Data/ under the current working directory (NuGet contentFiles output)
3. Walk up the directory tree to find AppData/_NxNlpData/
4. Legacy EasyOneX6 layout (backward compatibility)
To override, set the environment variable:
export NXNLP_DATA_DIR=/path/to/nxnlp/data
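The lookup order above can be sketched as a small standalone function. This is an illustration only, not NxNlp's actual implementation: `ResolveDataDir` is a hypothetical name, the sketch assumes the walk-up starts at the working directory, and it skips an override that points at a missing directory. The legacy EasyOneX6 layout is not described here, so that step is left out.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical helper (not part of NxNlp): mirrors the documented lookup order.
static string? ResolveDataDir(string startDir)
{
    // 1. Explicit override via the NXNLP_DATA_DIR environment variable.
    //    Assumption: an override pointing at a missing directory is skipped.
    var fromEnv = Environment.GetEnvironmentVariable("NXNLP_DATA_DIR");
    if (!string.IsNullOrWhiteSpace(fromEnv) && Directory.Exists(fromEnv))
        return Path.GetFullPath(fromEnv);

    // 2. NxNlp.Data/ directly under the starting (working) directory.
    var local = Path.Combine(startDir, "NxNlp.Data");
    if (Directory.Exists(local))
        return Path.GetFullPath(local);

    // 3. Walk up the directory tree looking for AppData/_NxNlpData/.
    for (var dir = new DirectoryInfo(startDir); dir is not null; dir = dir.Parent)
    {
        var candidate = Path.Combine(dir.FullName, "AppData", "_NxNlpData");
        if (Directory.Exists(candidate))
            return candidate;
    }

    // 4. The legacy EasyOneX6 layout is not specified here, so it is omitted.
    return null;
}

var resolved = ResolveDataDir(Directory.GetCurrentDirectory());
Console.WriteLine(resolved ?? "No NxNlp data directory found.");
```

Running a check like this at application startup can help confirm which directory would win when several candidates exist, for example when both an environment override and a NuGet contentFiles copy are present.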
Requirements
- .NET 10.0 or later
- Git LFS (for cloning the repository -- OpenNlp25 and Onnx models)
NuGet Feed
All packages are hosted on:
https://nuget.easyrote.net/v3/index.json
License
Copyright EasyRote. All rights reserved.
Showing the top 20 packages that depend on NxNlp.
| Packages | Downloads |
|---|---|
| NxNlp.Cli: NxNlp CLI - Command-line tools for lexicon building and dictionary management. | 6 |
| NxReader.NlpPlugins: NxReader NLP Plugins - Natural language processing plugins for the reader | 5 |
| NxReader.Library: NxReader Library - shared library and document management services and UI | 5 |
| NxNlp.Onnx: ONNX Runtime transformer model support for NxNlp NLP pipeline. | 4 |
| NxReader.NlpPlugins: NxReader NLP Plugins - Natural language processing plugins for the reader | 4 |
| NxReader.Library: NxReader Library - shared library and document management services and UI | 4 |
.NET 10.0
- AngleSharp (>= 1.3.0)
- Serilog.Extensions.Logging (>= 9.0.2)
- OpenAI (>= 2.3.0)
- Nito.AsyncEx (>= 5.1.2)
- Newtonsoft.Json (>= 13.0.3)
- NTextCat (>= 0.3.65)
- Microsoft.Recognizers.Text.DateTime (>= 1.8.13)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.0)
- Microsoft.Data.Sqlite (>= 10.0.0)
- MessagePack (>= 3.1.4)
- JetBrains.Annotations (>= 2025.2.2)
- HtmlAgilityPack (>= 1.12.2)
- FuzzySharp (>= 2.0.2)
- AngleSharp.Xml (>= 1.0.0)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 10.0.0)