<?php
/**
 * Sastrawi (https://github.com/sastrawi/sastrawi)
 *
 * @link      http://github.com/sastrawi/sastrawi for the canonical source repository
 * @license   https://github.com/sastrawi/sastrawi/blob/master/LICENSE The MIT License (MIT)
 */

namespace Sastrawi\Stemmer\Filter;

/**
 * Normalizes text before the stemming process.
 */
class TextNormalizer
{
    /**
     * Lowercases the text, replaces every character other than a-z, 0-9,
     * space, and hyphen with a space, then collapses runs of spaces.
     *
     * @param  string $text raw input text
     * @return string normalized text
     */
    public static function normalizeText($text)
    {
        $text = strtolower($text);
        // Replace anything that is not a-z, 0-9, space, or hyphen with a space.
        $text = preg_replace('/[^a-z0-9 -]/im', ' ', $text);
        // Collapse runs of spaces into a single space.
        $text = preg_replace('/( +)/im', ' ', $text);

        return trim($text);
    }
}
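
The following is a minimal usage sketch of the normalization steps in the listing above. To keep it runnable without the library installed, it uses a hypothetical standalone function, `normalizeTextSketch()`, that copies the three steps of `normalizeText()`; the function name and the sample sentences are assumptions for illustration, not part of Sastrawi.

```php
<?php
// Hypothetical standalone copy of the steps in TextNormalizer::normalizeText().
// With the library installed, the equivalent call would be:
//   \Sastrawi\Stemmer\Filter\TextNormalizer::normalizeText($raw);
function normalizeTextSketch($text)
{
    $text = strtolower($text);
    // Anything other than a-z, 0-9, space, or hyphen becomes a space.
    $text = preg_replace('/[^a-z0-9 -]/im', ' ', $text);
    // Runs of spaces collapse into a single space.
    $text = preg_replace('/( +)/im', ' ', $text);

    return trim($text);
}

// Punctuation is dropped, case is folded, and hyphens are kept.
echo normalizeTextSketch("  Anak-anak   BERMAIN, di taman.  "), "\n";
// prints "anak-anak bermain di taman"

// Newlines are not in the allowed set, so they also become single spaces.
echo normalizeTextSketch("Harga naik 10%\nbulan ini!"), "\n";
// prints "harga naik 10 bulan ini"
```

Hyphens survive normalization, so reduplicated words such as "anak-anak" reach the later stemming stages as a single token. Digits are also kept, despite the original comment mentioning only alphabetic characters.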
API documentation generated by ApiGen 2.8.0