You can install the package via composer:
$ composer require oblak/syllabizer

<?php
require __DIR__ . '/vendor/autoload.php';
use Oblak\Syllabizer;
$syllabizer = new Syllabizer();
$syllabizer->syllabize('jednak'); // ['jed', 'nak']
$syllabizer->syllabize('tramvaj'); // ['tram', 'vaj']
$syllabizer->syllabize('pidžama'); // ['pi', 'dža', 'ma']
$syllabizer->syllabize('mačka'); // ['ma', 'čka']
// Cyrillic works just as well
$syllabizer->syllabize('сломљен'); // ['слом', 'љен']
// Syllabic R (slogotvorno r) is a nucleus of its own
$syllabizer->syllabize('brzo'); // ['br', 'zo']
$syllabizer->syllabize('rđa'); // ['r', 'đa']
// Count the syllables
count($syllabizer->syllabize('slogovnik')); // 3

syllabize() accepts a string or any Stringable and returns an ordered array of syllables. Joining the result reproduces the original word exactly:
$word = 'doneti';
implode('', $syllabizer->syllabize($word)) === $word; // true

tokenize() is a convenience wrapper that returns the syllables as a single string, joined by a separator (a hyphen by default):
$syllabizer->tokenize('doneti'); // 'do-ne-ti'
$syllabizer->tokenize('сломљен'); // 'слом-љен'
// Pass any separator you like
$syllabizer->tokenize('doneti', '·'); // 'do·ne·ti'

The library follows the standard pedagogical rules for Serbian syllabification:
- Both scripts: Latin and Cyrillic input are supported. The Latin digraphs lj, nj and dž (in any case) count as a single consonant and are never split, just like their Cyrillic counterparts љ, њ, џ.
- Vowels carry syllables: the number of syllables equals the number of vowels (a e i o u), plus any syllabic R.
- Syllabic R: an r with no neighbouring vowel (between consonants, or word‑initial before a consonant) becomes a syllable nucleus: pr‑st, tr‑ka, r‑vač.
- Consonant clusters: a single consonant opens the following syllable (li‑va‑da); within a cluster the boundary falls between two sonants (or‑la, tram‑vaj) or between a plosive and a following non‑approximant (lop‑ta, sred‑stvo); otherwise the whole cluster opens the next syllable (la‑sta, je‑dva, sve‑tlost).
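
The first three rules can be illustrated with a short standalone sketch. It is not the package's implementation, and the `sketch_*` function names are hypothetical. It only shows how digraphs might be kept together and how nuclei (vowels plus syllabic R) might be counted. It assumes UTF-8 input and treats any r without a neighbouring vowel as syllabic.

```php
<?php

declare(strict_types=1);

/**
 * Split a word into letters, keeping the Latin digraphs lj, nj and dž together.
 * Hypothetical helper for illustration; not part of oblak/syllabizer.
 *
 * @return string[]
 */
function sketch_letters(string $word): array
{
    // /u makes "." match one UTF-8 character; /i also covers Lj, NJ, Dž and so on.
    preg_match_all('/lj|nj|dž|./iu', $word, $matches);

    return $matches[0];
}

/**
 * True for the five vowels in either script (a e i o u and their Cyrillic forms).
 */
function sketch_is_vowel(string $letter): bool
{
    // \x{0430} \x{0435} \x{0438} \x{043E} \x{0443} are the Cyrillic vowels.
    return preg_match('/^[aeiou\x{0430}\x{0435}\x{0438}\x{043E}\x{0443}]$/iu', $letter) === 1;
}

/**
 * Count syllable nuclei: every vowel, plus every r with no vowel next to it.
 * Assumption: any such r counts, wherever it stands in the word.
 */
function sketch_count_nuclei(string $word): int
{
    $letters = sketch_letters($word);
    $count   = 0;

    foreach ($letters as $i => $letter) {
        if (sketch_is_vowel($letter)) {
            $count++;
            continue;
        }

        // Only Latin r or Cyrillic r (\x{0440}) can be a non-vowel nucleus.
        if (preg_match('/^[r\x{0440}]$/iu', $letter) !== 1) {
            continue;
        }

        $prevIsVowel = $i > 0 && sketch_is_vowel($letters[$i - 1]);
        $nextIsVowel = isset($letters[$i + 1]) && sketch_is_vowel($letters[$i + 1]);

        if (!$prevIsVowel && !$nextIsVowel) {
            $count++;
        }
    }

    return $count;
}
```

With these helpers, `sketch_letters('pidžama')` yields six letters with `dž` kept whole, `sketch_count_nuclei('brzo')` and `sketch_count_nuclei('rđa')` both give 2 because the r has no vowel beside it, and `sketch_count_nuclei('orla')` also gives 2 because the r follows a vowel and stays a plain consonant. Placing the boundaries inside consonant clusters is left out of the sketch.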
$ composer test

$ composer lint # check
$ composer lint:fix # auto-fix

The MIT License (MIT). Please see the License File for more information.