Sphinx

The official yii2-sphinx extension for Yii 2 adds support for the Sphinx full-text search engine to the framework. It supports all Sphinx features, including Real-time Indexes.

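Installation

Sphinx is typically used together with MySQL. Download and install the Sphinx server; the minimum required version is 2.0. To use all features of the extension, however, Sphinx 2.2 or higher is required.

The preferred way to install this extension is through composer: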

composer require yiisoft/yii2-sphinx

Alternatively, add the following to the require section of your composer.json file:

"yiisoft/yii2-sphinx": "~2.0.0"

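Then run composer install.

Configuration

This extension interacts with the Sphinx search daemon using the MySQL protocol and the SphinxQL query language (a dialect of SQL). To make the Sphinx “searchd” daemon support the MySQL protocol, add the following configuration: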

searchd
{
    listen = localhost:9306:mysql41
    ...
}

To use this extension, add the following code to your application configuration:

return [
    //....
    'components' => [
        'sphinx' => [
            'class' => 'yii\sphinx\Connection',
            'dsn' => 'mysql:host=127.0.0.1;port=9306;',
            'username' => '',
            'password' => '',
        ],
    ],
];

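Basic usage

Since this extension uses the MySQL protocol to access Sphinx, it shares its basic approach and much of its code with the regular “yii\db” package. Running SphinxQL queries is very similar to running regular SQL queries: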

$sql = 'SELECT * FROM idx_item WHERE group_id = :group_id';
$params = [
    'group_id' => 17
];
$rows = Yii::$app->sphinx->createCommand($sql, $params)->queryAll();

You can also use the query builder:

use yii\sphinx\Query;

$query = new Query();
$rows = $query->select('id, price')
    ->from('idx_item')
    ->andWhere(['group_id' => 1])
    ->all();

Note: by default, Sphinx limits the number of records returned by any query to 10. If you need more records, specify the limit explicitly.

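Composing the MATCH statement

The point of using Sphinx is its full-text search capability, which SphinxQL provides via the ‘MATCH’ statement. You can always compose it manually as part of the “where” condition, but if you are using yii\sphinx\Query you can do it via yii\sphinx\Query::match(), for example: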

use yii\sphinx\Query;

$query = new Query();
$rows = $query->from('idx_item')
    ->match($_POST['search'])
    ->all();

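Note that the argument of the Sphinx ‘MATCH’ statement uses a complex internal syntax for finer tuning. By default, yii\sphinx\Query::match() escapes all special characters related to this syntax. So if you want to use a complex ‘MATCH’ statement, you should use yii\db\Expression for it, for example: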

use yii\sphinx\Query;
use yii\db\Expression;

$query = new Query();
$rows = $query->from('idx_item')
    ->match(new Expression(':match', ['match' => '@(content) ' . Yii::$app->sphinx->escapeMatchValue($_POST['search'])]))
    ->all();

Note: if you compose the MATCH argument yourself, make sure to use \yii\sphinx\Connection::escapeMatchValue() to properly escape any special characters that could break the query.
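
To illustrate what such escaping does, here is a minimal standalone sketch. It is not the extension's implementation: the helper name escapeForMatch() and the character list are assumptions made for illustration only. In application code, call \yii\sphinx\Connection::escapeMatchValue() instead.

```php
<?php
// Hypothetical helper: backslash-escapes characters that carry special
// meaning in the Sphinx full-text query syntax. The character list is an
// illustrative assumption, not the list used by yii2-sphinx.
function escapeForMatch(string $value): string
{
    return addcslashes($value, '\\()|-!@~"&/^$=<>');
}

$userInput = 'foo@bar (baz)';
$escaped = escapeForMatch($userInput);

// The field operator "@" and the grouping parentheses are now plain text.
echo $escaped, PHP_EOL;
```

Without escaping, the “@” in the input above would be read as a field operator rather than as part of the search phrase.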

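Since version 2.0.6 you can use \yii\sphinx\MatchExpression to compose the ‘MATCH’ statement. It allows composing ‘MATCH’ expressions with placeholders, in a way similar to bound parameters; the values are escaped automatically using \yii\sphinx\Connection::escapeMatchValue(). For example: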

use yii\sphinx\Query;
use yii\sphinx\MatchExpression;

$rows = (new Query())
    ->match(new MatchExpression('@title :title', ['title' => 'Yii'])) // value of ':title' will be escaped automatically
    ->all();

Using Active Record

This extension provides an Active Record solution similar to \yii\db\ActiveRecord. To declare an Active Record class, extend \yii\sphinx\ActiveRecord and implement the indexName method:

use yii\sphinx\ActiveRecord;

class Article extends ActiveRecord
{
    /**
     * @return string the name of the index associated with this ActiveRecord class.
     */
    public static function indexName()
    {
        return 'idx_article';
    }
}

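With \yii\sphinx\MatchExpression you can use [[match()]], [[andMatch()]] and [[orMatch()]] to combine several conditions. Each condition can be specified using array syntax, similar to [[\yii\sphinx\Query::where]]. For example: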

use yii\sphinx\Query;
use yii\sphinx\MatchExpression;

$rows = (new Query())
    ->match(
        // produces '((@title "Yii") (@author "Paul")) | (@content "Sphinx")' :
        (new MatchExpression())
            ->match(['title' => 'Yii'])
            ->andMatch(['author' => 'Paul'])
            ->orMatch(['content' => 'Sphinx'])
    )
    ->all();

You can also compose expressions with special operators such as “MAYBE” and “PROXIMITY”. For example:

use yii\sphinx\Query;
use yii\sphinx\MatchExpression;

$rows = (new Query())
    ->match(
        // produces '@title "Yii" MAYBE "Sphinx"' :
        (new MatchExpression())->match([
            'maybe',
            'title',
            'Yii',
            'Sphinx',
        ])
    )
    ->all();

$rows = (new Query())
    ->match(
        // produces '@title "Yii"~10' :
        (new MatchExpression())->match([
            'proximity',
            'title',
            'Yii',
            10,
        ])
    )
    ->all();

Fetching query META information

Sphinx allows fetching statistical information about the last performed query via the SHOW META SphinxQL statement. This information is commonly used to get the total count of rows in the index without an extra SELECT COUNT(*) ... query. Although you can always run such a query manually, yii\sphinx\Query can do it automatically. All you need to do is enable yii\sphinx\Query::showMeta and use yii\sphinx\Query::search() to fetch all rows and the meta information:

$query = new Query();
$results = $query->from('idx_item')
    ->match('foo')
    ->showMeta(true) // enable automatic 'SHOW META' query
    ->search(); // retrieve all rows and META information

$items = $results['hits'];
$meta = $results['meta'];
$totalItemCount = $results['meta']['total'];

Note: The total item count that can be extracted from ‘meta’ is limited by the max_matches Sphinx option. If your index contains more records than the max_matches value (usually 1000), you should either raise max_matches via [[Query::options]] or use [[Query::count()]] to retrieve the record count.
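
The following sketch shows one way to detect that the reported total has hit this cap. It assumes that the ‘meta’ array mirrors the SHOW META output, which reports both ‘total’ (matches that can be retrieved) and ‘total_found’ (all matches in the index). The sample values and the helper name isTotalCapped() are hypothetical.

```php
<?php
// Hypothetical sample of $results['meta']; SHOW META returns values as strings.
$meta = [
    'total' => '1000',
    'total_found' => '25340',
    'time' => '0.002',
];

// Returns true when more rows matched than Sphinx is willing to return.
function isTotalCapped(array $meta): bool
{
    return (int) $meta['total_found'] > (int) $meta['total'];
}

if (isTotalCapped($meta)) {
    echo 'Only ', $meta['total'], ' of ', $meta['total_found'], ' matches are reachable', PHP_EOL;
}
```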

FACET clause

Since version 2.2.3, Sphinx supports faceted search via the FACET clause:

SELECT * FROM idx_item FACET brand_id FACET categories;

yii\sphinx\Query supports composing this clause as well as fetching facet results. You may specify facets via yii\sphinx\Query::facets. To fetch results with facets, use the yii\sphinx\Query::search() method. For example:

use yii\sphinx\Query;

$query = new Query();
$results = $query->from('idx_item')
    ->facets([
        'brand_id',
        'categories',
    ])
    ->search($connection); // retrieve all rows and facets

$items = $results['hits'];
$facets = $results['facets'];

foreach ($results['facets']['brand_id'] as $frame) {
    $brandId = $frame['value'];
    $count = $frame['count'];
    // ...
}

Note: make sure you are using Sphinx server version 2.2.3 or higher before attempting to use the facet feature.
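
As a small illustration of working with a facet result, the sketch below turns the list of frames into a lookup table. The sample rows are hypothetical; they only assume the ‘value’ and ‘count’ keys used in the loop above.

```php
<?php
// Hypothetical facet rows, shaped like $results['facets']['brand_id'].
$brandFacet = [
    ['value' => 3, 'count' => 12],
    ['value' => 7, 'count' => 5],
    ['value' => 9, 'count' => 1],
];

// Build a "facet value => number of hits" lookup table.
$countsByBrand = array_column($brandFacet, 'count', 'value');

// Most frequent brands first, keeping the brand IDs as keys.
arsort($countsByBrand);

foreach ($countsByBrand as $brandId => $count) {
    echo "brand {$brandId}: {$count}", PHP_EOL;
}
```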

You may specify additional facet options like select or order using an array format:

use yii\db\Expression;
use yii\sphinx\Query;

$query = new Query();
$results = $query->from('idx_item')
    ->facets([
        'price' => [
            'select' => 'INTERVAL(price,200,400,600,800) AS price', // using function
            'order' => ['FACET()' => SORT_ASC],
        ],
        'name_in_json' => [
            'select' => [new Expression('json_attr.name AS name_in_json')], // have to use `Expression` to avoid unnecessary quoting
        ],
    ])
    ->search($connection);

Note: if you specify a custom select for a facet, ensure the facet name has the corresponding column inside the select statement. For example, if you have specified a facet named ‘my_facet’, its select statement should contain ‘my_facet’ attribute or an expression aliased as ‘my_facet’ (‘expr() AS my_facet’).

Using data providers

You can use [[\yii\data\ActiveDataProvider]] with the [[\yii\sphinx\Query]] and [[\yii\sphinx\ActiveQuery]]:

use yii\data\ActiveDataProvider;
use yii\sphinx\Query;

$query = new Query();
$query->from('yii2_test_article_index')->match('development');
$provider = new ActiveDataProvider([
    'query' => $query,
    'pagination' => [
        'pageSize' => 10,
    ]
]);
$models = $provider->getModels();

use yii\data\ActiveDataProvider;
use app\models\Article;

$provider = new ActiveDataProvider([
    'query' => Article::find(),
    'pagination' => [
        'pageSize' => 10,
    ]
]);
$models = $provider->getModels();

However, if you want to use the ‘facet’ feature or benefit from the query meta information, you need to use yii\sphinx\ActiveDataProvider. It can prepare the total item count from the query ‘meta’ information and fetch the facet results:

use yii\sphinx\ActiveDataProvider;
use yii\sphinx\Query;

$query = new Query();
$query->from('idx_item')
    ->match('foo')
    ->showMeta(true)
    ->facets([
        'brand_id',
        'categories',
    ]);
$provider = new ActiveDataProvider([
    'query' => $query,
    'pagination' => [
        'pageSize' => 10,
    ]
]);
$models = $provider->getModels();
$facets = $provider->getFacets();
$brandIdFacet = $provider->getFacet('brand_id');

Note: Because the pagination offset and limit may exceed the Sphinx ‘max_matches’ bounds, the data provider sets the ‘max_matches’ option automatically based on those values. However, if [[Query::showMeta]] is set, this adjustment is not performed because it would break the total count calculation, so you have to deal with the ‘max_matches’ bounds on your own.
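
When you handle the bounds yourself, it helps to know how large ‘max_matches’ must be for a given page. The sketch below is plain arithmetic under stated assumptions: the helper name requiredMaxMatches() is hypothetical, and page numbers are taken to be one-based.

```php
<?php
// Hypothetical helper: the smallest max_matches value that still covers
// the last row of the requested page (page numbers start at 1).
function requiredMaxMatches(int $page, int $pageSize): int
{
    $offset = ($page - 1) * $pageSize;

    return $offset + $pageSize;
}

// Page 150 with 10 rows per page needs rows 1491..1500,
// which is beyond the usual max_matches value of 1000.
var_dump(requiredMaxMatches(150, 10)); // int(1500)
```

A value computed this way could then be passed to the query via [[Query::options]].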

Building snippets (excerpts)

A snippet (excerpt) is a fragment of the index source text that contains highlighted words from the full-text search condition. Sphinx has a powerful built-in mechanism for composing snippets. However, since Sphinx does not store the original indexed text, the snippets for the rows in a query result must be built separately via another query. Such a query can be performed via yii\sphinx\Command::callSnippets():

$sql = "SELECT * FROM idx_item WHERE MATCH('about')";
$rows = Yii::$app->sphinx->createCommand($sql)->queryAll();

$rowSnippetSources = [];
foreach ($rows as $row) {
    $rowSnippetSources[] = file_get_contents('/path/to/index/files/' . $row['id'] . '.txt');
}

$snippets = Yii::$app->sphinx->createCommand($sql)->callSnippets('idx_item', $rowSnippetSources, 'about');

You can simplify this workflow using [[yii\sphinx\Query::snippetCallback]]. It is a PHP callback that receives the array of query result rows as an argument and must return an array of snippet source strings in the same order as the incoming rows. Example:

use yii\sphinx\Query;

$query = new Query();
$rows = $query->from('idx_item')
    ->match($_POST['search'])
    ->snippetCallback(function ($rows) {
        $result = [];
        foreach ($rows as $row) {
            $result[] = file_get_contents('/path/to/index/files/' . $row['id'] . '.txt');
        }
        return $result;
    })
    ->all();

foreach ($rows as $row) {
    echo $row['snippet'];
}

If you are using Active Record, you can use [[yii\sphinx\ActiveQuery::snippetByModel()]] to compose snippets. This method retrieves the snippet source for each row by calling the getSnippetSource() method of the result model. All you need to do is implement that method in your Active Record class so that it returns the correct value:

use yii\sphinx\ActiveRecord;

class Article extends ActiveRecord
{
    public function getSnippetSource()
    {
        return file_get_contents('/path/to/source/files/' . $this->id . '.txt');
    }
}

$articles = Article::find()->snippetByModel()->all();

foreach ($articles as $article) {
    echo $article->snippet;
}

Using the Gii generator

This extension provides a code generator that can be integrated with the Yii ‘gii’ module. It generates Active Record code. To enable it, adjust your application configuration as follows:

return [
    //....
    'modules' => [
        // ...
        'gii' => [
            'class' => 'yii\gii\Module',
            'generators' => [
                'sphinxModel' => [
                    'class' => 'yii\sphinx\gii\model\Generator'
                ]
            ],
        ],
    ]
];

Float parameter binding

There is an issue with binding float values using PDO and SphinxQL. PDO does not provide a way to bind a float parameter in prepared statement mode, so float values are passed with mode PDO::PARAM_STR and are bound to the statement as quoted strings, e.g. '9.85'. Unfortunately, SphinxQL is unable to recognize float values passed this way and produces the following error:

syntax error, unexpected QUOTED_STRING, expecting CONST_INT or CONST_FLOAT

To work around this problem, any parameter bound to [[\yii\sphinx\Command]] whose PHP type is exactly ‘float’ is inserted into the SphinxQL content as a literal instead of being bound.

This feature works only if value is a native PHP float (strings containing floats do not work). For example:

use yii\sphinx\Query;

// following code works fine:
$rows = (new Query())->from('item_index')
    ->where('price > :price AND price < :priceMax', [
        'price' => 2.1,
        'priceMax' => 2.9,
    ])
    ->all();

// this one produces an error:
$rows = (new Query())->from('item_index')
    ->where('price > :price AND price < :priceMax', [
        'price' => '2.1',
        'priceMax' => '2.9',
    ])
    ->all();
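
When such values arrive as strings, for example from request input, one option is to cast them to native floats before passing them as parameters. The sketch below is a standalone illustration; the helper name toFloatParam() is hypothetical and not part of the extension.

```php
<?php
// Hypothetical helper: turns numeric input into a native PHP float so that
// it can be inlined as a literal instead of being bound as a quoted string.
function toFloatParam($value): float
{
    if (!is_numeric($value)) {
        throw new InvalidArgumentException('Expected a numeric value');
    }

    return (float) $value;
}

$params = [
    'price' => toFloatParam('2.1'),    // string, e.g. from user input
    'priceMax' => toFloatParam('2.9'),
];

var_dump(is_float($params['price'])); // bool(true)
```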

However, if you are using ‘hash’ conditions over index fields declared as ‘float’, the type conversion is performed automatically:

use yii\sphinx\Query;

// following code works fine in case 'price' is a float field in 'item_index':
$rows = (new Query())->from('item_index')
    ->where([
        'price' => '2.5'
    ])
    ->all();

Note: by the time you read this, float parameter binding may already have been fixed on the Sphinx server side, or there may be other concerns that make this functionality undesirable. In that case you can disable the automatic float parameter conversion via [[\yii\sphinx\Connection::enableFloatConversion]].

Using distributed indexes

This extension uses the DESCRIBE query to fetch information about the Sphinx index structure (field names and types). For distributed indexes, however, this is not always possible. The schema of such an index can be found only if its declaration contains at least one available local index. For example:

index item_distributed
{
    type = distributed

    # local index :
    local = item_local

    # remote indexes :
    agent = 127.0.0.1:9312:remote_item_1
    agent = 127.0.0.1:9313:remote_item_2
    # ...
}

It is recommended to have at least one local index in the distributed index declaration. You are not forced to actually use it - this local index may be empty, it is needed for the schema declaration only.

It is still allowed to specify a distributed index without a local one. For such an index the default dummy schema is used. In this case, however, automatic typecasting of the index fields is unavailable and you have to typecast the data on your own. For example:

use yii\sphinx\Query;

// distributed index with local
$rows = (new Query())->from('item_distributed_with_local')
    ->where(['category_id' => '12']) // works fine: string `'12'` is converted to integer `12`
    ->all();

// distributed index without local
$rows = (new Query())->from('item_distributed_without_local')
    ->where(['category_id' => '12']) // produces SphinxQL error: 'syntax error, unexpected QUOTED_STRING, expecting CONST_INT'
    ->all();

$rows = (new Query())->from('item_distributed_without_local')
    ->where(['category_id' => (int)'12']) // need to perform typecasting
    ->all();
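
If many conditions need casting, the manual step can be centralized. The following is a minimal sketch, assuming you maintain the attribute types yourself; the $attributeTypes map and the castConditions() helper are hypothetical and not part of the extension.

```php
<?php
// Hypothetical attribute types for an index whose schema cannot be detected.
$attributeTypes = [
    'category_id' => 'int',
    'price' => 'float',
];

// Casts each condition value to its declared type;
// attributes missing from the map are passed through unchanged.
function castConditions(array $conditions, array $types): array
{
    foreach ($conditions as $name => $value) {
        if (isset($types[$name])) {
            settype($value, $types[$name]);
            $conditions[$name] = $value;
        }
    }

    return $conditions;
}

$where = castConditions(['category_id' => '12', 'price' => '2.5'], $attributeTypes);
var_dump($where['category_id']); // int(12)
```

The resulting array could then be passed to where() in place of the raw string values.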

Author email: zhuzixian520@126.com, GitHub: github.com/zhuzixian520

This article was first published on LearnKu.com.
