' : ' <').
          ($search['fields'][$key]['inc'] ? '=' : '').
          " '".$search['fields'][$key]['date']."'";
      }
    }
  }

  if (isset($search['fields']['cat']))
  {
    if ($search['fields']['cat']['sub_inc'])
    {
      // searching all the categories id of sub-categories
      $cat_ids = get_subcat_ids($search['fields']['cat']['words']);
    }
    else
    {
      $cat_ids = $search['fields']['cat']['words'];
    }

    $local_clause = 'category_id IN ('.implode(',', $cat_ids).')';
    $clauses[] = $local_clause;
  }

  // adds brackets around where clauses
  $clauses = prepend_append_array_items($clauses, '(', ')');

  $where_separator =
    implode(
      "\n ".$search['mode'].' ',
      $clauses
      );

  $search_clause = $where_separator;

  return $search_clause;
}

/**
 * Returns the list of items corresponding to the advanced search array.
 *
 * @param array $search
 * @param string $images_where optional additional restriction on images table
 * @return array
 */
function get_regular_search_results($search, $images_where='')
{
  global $conf;
  $forbidden = get_sql_condition_FandF(
        array
          (
            'forbidden_categories' => 'category_id',
            'visible_categories' => 'category_id',
            'visible_images' => 'id'
          ),
        "\n AND"
    );

  $items = array();
  $tag_items = array();

  if (isset($search['fields']['tags']))
  {
    $tag_items = get_image_ids_for_tags(
      $search['fields']['tags']['words'],
      $search['fields']['tags']['mode']
      );
  }

  $search_clause = get_sql_search_clause($search);

  if (!empty($search_clause))
  {
    $query = '
SELECT DISTINCT(id)
  FROM '.IMAGES_TABLE.' i
    INNER JOIN '.IMAGE_CATEGORY_TABLE.' AS ic ON id = ic.image_id
  WHERE '.$search_clause;
    if (!empty($images_where))
    {
      $query .= "\n AND ".$images_where;
    }
    $query .= $forbidden.'
'.$conf['order_by'];
    $items = array_from_query($query, 'id');
  }

  if ( !empty($tag_items) )
  {
    switch ($search['mode'])
    {
      case 'AND':
        if (empty($search_clause))
        {
          $items = $tag_items;
        }
        else
        {
          $items = array_values( array_intersect($items, $tag_items) );
        }
        break;
      case 'OR':
        $before_count = count($items);
        $items = array_unique(
          array_merge(
            $items,
            $tag_items
            )
          );
        break;
    }
  }

  return $items;
}

/**
 * Finds if a char is a letter, a figure or any char of the extended ASCII table (>127).
 *
 * @param char $ch
 * @return bool
 */
function is_word_char($ch)
{
  return ($ch>='0' && $ch<='9') || ($ch>='a' && $ch<='z') || ($ch>='A' && $ch<='Z') || ord($ch)>127;
}

/**
 * Finds if a char is a special token for word start: [{<=*+
 *
 * @param char $ch
 * @return bool
 */
function is_odd_wbreak_begin($ch)
{
  return strpos('[{<=*+', $ch)===false ? false:true;
}

/**
 * Finds if a char is a special token for word end: ]}>=*+
 *
 * @param char $ch
 * @return bool
 */
function is_odd_wbreak_end($ch)
{
  return strpos(']}>=*+', $ch)===false ? false:true;
}

define('QST_QUOTED',         0x01);
define('QST_NOT',            0x02);
define('QST_OR',             0x04);
define('QST_WILDCARD_BEGIN', 0x08);
define('QST_WILDCARD_END',   0x10);
define('QST_WILDCARD', QST_WILDCARD_BEGIN|QST_WILDCARD_END);

/**
 * Analyzes and splits the quick/query search query $q into tokens.
 * q='john bill' => 2 tokens 'john' 'bill'
 * Special characters for MySQL full text search (+,<,>,~) appear in the token modifiers.
 * The query can contain a phrase: 'Pierre "New York"' will return 'pierre' and 'new york'.
 *
 * @param string $q
 */

/** A single search word or quoted phrase. */
class QSingleToken
{
  var $is_single = true;
  var $token;
  var $idx;

  function __construct($token)
  {
    $this->token = $token;
  }
}

/** A group of tokens (single or nested groups) with one modifier bit mask per token. */
class QMultiToken
{
  var $is_single = false;
  var $tokens = array();
  var $token_modifiers = array();

  function __toString()
  {
    $s = '';
    for ($i=0; $i<count($this->tokens); $i++)
    {
      $modifier = $this->token_modifiers[$i];
      if ($i)
        $s .= ' ';
      if ($modifier & QST_OR)
        $s .= 'OR ';
      if ($modifier & QST_NOT)
        $s .= 'NOT ';
      if ($modifier & QST_WILDCARD_BEGIN)
        $s .= '*';
      if ($modifier & QST_QUOTED)
        $s .= '"';
      if (! ($this->tokens[$i]->is_single) )
      {
        $s .= '(';
        $s .= $this->tokens[$i];
        $s .= ')';
      }
      else
      {
        $s .= $this->tokens[$i]->token;
      }
      if ($modifier & QST_QUOTED)
        $s .= '"';
      if ($modifier & QST_WILDCARD_END)
        $s .= '*';
    }
    return $s;
  }

  // stores the current token with its modifier, then resets both
  function push(&$token, &$modifier)
  {
    $this->tokens[] = new QSingleToken($token);
    $this->token_modifiers[] = $modifier;
    $token = "";
    $modifier = 0;
  }

  protected function parse_expression($q, &$qi, $level)
  {
    $crt_token = "";
    $crt_modifier = 0;

    for ($stop=false; !$stop && $qi<strlen($q); $qi++)
    {
      $ch = $q[$qi];
      if ( ($crt_modifier&QST_QUOTED)==0 )
      {
        switch ($ch)
        {
          case '(':
            if (strlen($crt_token))
              $this->push($crt_token, $crt_modifier);
            $sub = new QMultiToken;
            $qi++;
            $sub->parse_expression($q, $qi, $level+1);
            $this->tokens[] = $sub;
            $this->token_modifiers[] = $crt_modifier;
            $crt_modifier = 0;
            break;
          case ')':
            if ($level>0)
              $stop = true;
            break;
          case '"':
            if (strlen($crt_token))
              $this->push($crt_token, $crt_modifier);
            $crt_modifier |= QST_QUOTED;
            break;
          case '-':
            if (strlen($crt_token))
              $crt_token .= $ch;
            else
              $crt_modifier |= QST_NOT;
            break;
          case '*':
            if (strlen($crt_token))
              $crt_token .= $ch; // wildcard end later
            else
              $crt_modifier |= QST_WILDCARD_BEGIN;
            break;
          default:
            if (preg_match('/[\s,.;!\?]+/', $ch))
            { // white space
              if (strlen($crt_token))
                $this->push($crt_token, $crt_modifier);
              $crt_modifier = 0;
            }
            else
              $crt_token .= $ch;
            break;
        }
      }
      else
      {// quoted
        if ($ch=='"')
        {
          if ($qi+1 < strlen($q) && $q[$qi+1]=='*')
          {
            $crt_modifier |= QST_WILDCARD_END;
            $qi++; // skip the '*' that follows the closing quote
          }
          $this->push($crt_token, $crt_modifier);
        }
        else
          $crt_token .= $ch;
      }
    }

    if (strlen($crt_token))
      $this->push($crt_token, $crt_modifier);

    for ($i=0; $i<count($this->tokens); $i++)
    {
      $token = $this->tokens[$i];
      $remove = false;
      if ($token->is_single)
      {
        if ( ($this->token_modifiers[$i]&QST_QUOTED)==0 )
        {
          if ('not' == strtolower($token->token))
          {
            if ($i+1 < count($this->tokens))
              $this->token_modifiers[$i+1] |= QST_NOT;
            $token->token = "";
          }
          if ('or' == strtolower($token->token))
          {
            if ($i+1 < count($this->tokens))
              $this->token_modifiers[$i+1] |= QST_OR;
            $token->token = "";
          }
          if ('and' == strtolower($token->token))
          {
            $token->token = "";
          }
          if ( substr($token->token, -1)=='*' )
          {
            $token->token = rtrim($token->token, '*');
            $this->token_modifiers[$i] |= QST_WILDCARD_END;
          }
        }
        if (!strlen($token->token))
          $remove = true;
      }
      else
      {
        if (!count($token->tokens))
          $remove = true;
      }
      if ($remove)
      {
        array_splice($this->tokens, $i, 1);
        array_splice($this->token_modifiers, $i, 1);
        $i--;
      }
    }
  }
}

class QExpression extends QMultiToken
{
  var $stokens = array();
  var $stoken_modifiers = array();

  function __construct($q)
  {
    $i = 0;
    $this->parse_expression($q, $i, 0);
    //@TODO: manipulate the tree so that 'a OR b c' is the same as 'b c OR a'
    $this->build_single_tokens($this);
  }

  // flattens the token tree into $stokens / $stoken_modifiers
  private function build_single_tokens(QMultiToken $expr)
  {
    //@TODO: double negation results in no negation in token modifier
    for ($i=0; $i<count($expr->tokens); $i++)
    {
      $token = $expr->tokens[$i];
      if ($token->is_single)
      {
        $token->idx = count($this->stokens);
        $this->stokens[] = $token->token;
        $this->stoken_modifiers[] = $expr->token_modifiers[$i];
      }
      else
        $this->build_single_tokens($token);
    }
  }
}

class QResults
{
  var $all_tags;
  var $tag_ids;
  var $tag_iids;
  var $images_iids;
  var $iids;
}

function qsearch_get_images(QExpression $expr, QResults $qsr)
{
  //@TODO: inflections for english / french
  $qsr->images_iids = array_fill(0, count($expr->tokens), array());
  $query_base = 'SELECT id from '.IMAGES_TABLE.'
i WHERE ';
  for ($i=0; $i<count($expr->stokens); $i++)
  {
    $token = $expr->stokens[$i];
    $clauses = array();

    $like = addslashes($token);
    $like = str_replace( array('%','_'), array('\\%','\\_'), $like); // escape LIKE specials %_
    $clauses[] = 'CONVERT(file, CHAR) LIKE \'%'.$like.'%\'';

    if (strlen($token)>3) // default minimum full text index
    {
      $ft = $token;
      if ($expr->stoken_modifiers[$i] & QST_QUOTED)
        $ft = '"'.$ft.'"';
      if ($expr->stoken_modifiers[$i] & QST_WILDCARD_END)
        $ft .= '*';
      $clauses[] = 'MATCH(i.name, i.comment) AGAINST( \''.addslashes($ft).'\' IN BOOLEAN MODE)';
    }
    else
    {
      foreach( array('i.name', 'i.comment') as $field)
      {
        $clauses[] = $field.' LIKE \''.$like.' %\'';
        $clauses[] = $field.' LIKE \'% '.$like.'\'';
        $clauses[] = $field.' LIKE \'% '.$like.' %\'';
      }
    }
    $query = $query_base.'('.implode(' OR ', $clauses).')';
    $qsr->images_iids[$i] = query2array($query,null,'id');
  }
}

function qsearch_get_tags(QExpression $expr, QResults $qsr)
{
  $tokens = $expr->stokens;
  $token_modifiers = $expr->stoken_modifiers;

  $token_tag_ids = array_fill(0, count($tokens), array() );
  $all_tags = array();

  $token_tag_scores = $token_tag_ids;
  $transliterated_tokens = array();
  foreach ($tokens as $token)
  {
    $transliterated_tokens[] = transliterate($token);
  }

  $query = '
SELECT t.*, COUNT(image_id) AS counter
  FROM '.TAGS_TABLE.' t
    INNER JOIN '.IMAGE_TAG_TABLE.' ON id=tag_id
  GROUP BY id';
  $result = pwg_query($query);
  while ($tag = pwg_db_fetch_assoc($result))
  {
    $transliterated_tag = transliterate($tag['name']);

    // find how this tag matches query tokens
    for ($i=0; $i<count($tokens); $i++)
    {
      $transliterated_token = $transliterated_tokens[$i];

      $match = false;
      $pos = 0;
      while ( ($pos = strpos($transliterated_tag, $transliterated_token, $pos)) !== false )
      {
        if ( ($token_modifiers[$i]&QST_WILDCARD)==QST_WILDCARD )
        {// wildcard on both sides: any occurrence is a full match
          $match = 1;
          break;
        }
        $token_len = strlen($transliterated_token);

        // search begin of word
        $wbegin_len=0; $wbegin_char=' ';
        while ($pos-$wbegin_len > 0)
        {
          if (! is_word_char($transliterated_tag[$pos-$wbegin_len-1]) )
          {
            $wbegin_char = $transliterated_tag[$pos-$wbegin_len-1];
            break;
          }
          $wbegin_len++;
        }

        // search end of word
        $wend_len=0; $wend_char=' ';
        while ($pos+$token_len+$wend_len < strlen($transliterated_tag))
        {
          if (!
            is_word_char($transliterated_tag[$pos+$token_len+$wend_len]) )
          {
            $wend_char = $transliterated_tag[$pos+$token_len+$wend_len];
            break;
          }
          $wend_len++;
        }

        $this_score = 0;
        if ( ($token_modifiers[$i]&QST_WILDCARD)==0 )
        {// no wildcard begin or end
          if ($token_len <= 2)
          {// search for 1 or 2 characters must match exactly to avoid retrieving too much data
            if ($wbegin_len==0 && $wend_len==0 && !is_odd_wbreak_begin($wbegin_char) && !is_odd_wbreak_end($wend_char) )
              $this_score = 1;
          }
          elseif ($token_len == 3)
          {
            if ($wbegin_len==0)
              $this_score = $token_len / ($token_len + $wend_len);
          }
          else
          {
            $this_score = $token_len / ($token_len + 1.1 * $wbegin_len + 0.9 * $wend_len);
          }
        }

        if ($this_score>0)
          $match = max($match, $this_score );
        $pos++;
      }

      if ($match)
      {
        $tag_id = (int)$tag['id'];
        $all_tags[$tag_id] = $tag;
        $token_tag_ids[$i][] = $tag_id;
        $token_tag_scores[$i][] = $match;
      }
    }
  }

  // process tags
  $not_tag_ids = array();
  for ($i=0; $i<count($tokens); $i++)
  {
    // best scoring tags first
    array_multisort($token_tag_scores[$i], SORT_DESC|SORT_NUMERIC, $token_tag_ids[$i]);
    $is_not = $token_modifiers[$i]&QST_NOT;
    $counter = 0;

    for ($j=0; $j<count($token_tag_scores[$i]); $j++)
    {
      if ($is_not)
      {
        if ($token_tag_scores[$i][$j] < 0.8 ||
              ($j>0 && $token_tag_scores[$i][$j] < $token_tag_scores[$i][0]) )
        {
          array_splice($token_tag_scores[$i], $j);
          array_splice($token_tag_ids[$i], $j);
        }
      }
      else
      {
        $tag_id = $token_tag_ids[$i][$j];
        $counter += $all_tags[$tag_id]['counter'];
        if ($counter > 200 && $j>0 && $token_tag_scores[$i][0] > $token_tag_scores[$i][$j] )
        {// "many" images in previous tags and starting from this tag is less relevant
          array_splice($token_tag_ids[$i], $j);
          array_splice($token_tag_scores[$i], $j);
          break;
        }
      }
    }

    if ($is_not)
    {
      $not_tag_ids = array_merge($not_tag_ids, $token_tag_ids[$i]);
    }
  }

  $all_tags = array_diff_key($all_tags, array_flip($not_tag_ids));
  usort($all_tags, 'tag_alpha_compare');
  foreach ( $all_tags as &$tag )
  {
    $tag['name'] = trigger_event('render_tag_name', $tag['name'], $tag);
  }
  $qsr->all_tags = $all_tags;

  $qsr->tag_ids = $token_tag_ids;
  $qsr->tag_iids = array_fill(0, count($tokens), array() );

  for ($i=0; $i<count($tokens); $i++)
  {
    $tag_ids = $token_tag_ids[$i];

    if (!empty($tag_ids))
    {
      $query = '
SELECT image_id
  FROM '.IMAGE_TAG_TABLE.'
  WHERE tag_id IN ('.implode(',', $tag_ids).')
  GROUP BY image_id';
      $qsr->tag_iids[$i] = query2array($query, null, 'image_id');
    }
  }
}

function qsearch_eval(QExpression $expr, QResults $qsr, QMultiToken $crt_expr)
{
  $ids = $not_ids =
    array();
  $first = true;

  for ($i=0; $i<count($crt_expr->tokens); $i++)
  {
    $current = $crt_expr->tokens[$i];
    if ($current->is_single)
    {
      $crt_ids = $qsr->iids[$current->idx] = array_unique( array_merge($qsr->images_iids[$current->idx], $qsr->tag_iids[$current->idx]) );
    }
    else
      $crt_ids = qsearch_eval($expr, $qsr, $current);

    $modifier = $crt_expr->token_modifiers[$i];
    if ($modifier & QST_NOT)
      $not_ids = array_unique( array_merge($not_ids, $crt_ids));
    else
    {
      if ($modifier & QST_OR)
        $ids = array_unique( array_merge($ids, $crt_ids) );
      else
      {
        if ($current->is_single && empty($crt_ids))
        {
          //@TODO: mark this term as unmatched and tell users
          //@TODO: if we don't find a term at all, maybe ignore it and produce some results
        }
        if ($first)
          $ids = $crt_ids;
        else
          $ids = array_intersect($ids, $crt_ids);
        $first = false;
      }
    }
  }

  if (count($not_ids))
    $ids = array_diff($ids, $not_ids);
  return $ids;
}

/**
 * Returns the search results corresponding to a quick/query search.
 * A quick/query search returns many items (search is not strict), but results
 * are sorted by relevance unless $super_order_by is true.
 * Returns:
 * array (
 *  'items' => array of matching images
 *  'qs'    => array(
 *    'matching_tags' => array of matching tags
 *    'matching_cats' => array of matching categories
 *    'matching_cats_no_images' =>array(99) - matching categories without images
 *  )
 * )
 *
 * @param string $q
 * @param bool $super_order_by
 * @param string $images_where optional additional restriction on images table
 * @return array
 */
function get_quick_search_results($q, $super_order_by, $images_where='')
{
  global $user, $conf;

  $search_results =
    array(
      'items' => array(),
      'qs' => array('q'=>stripslashes($q)),
    );
  $q = trim($q);
  $expression = new QExpression($q);
//var_export($expression);

  $qsr = new QResults;
  qsearch_get_tags($expression, $qsr);
  qsearch_get_images($expression, $qsr);
//var_export($qsr->all_tags);

  $ids = qsearch_eval($expression, $qsr, $expression);

  // debug trace, appended to the page footer as an HTML comment
  $debug[] = "<!--\nparsed: ".$expression;
  $debug[] = count($expression->stokens).' tokens';
  for ($i=0; $i<count($expression->stokens); $i++)
  {
    $debug[] = $expression->stokens[$i].': '.count($qsr->tag_ids[$i]).' tags, '.count($qsr->tag_iids[$i]).' tagged photos, '.count($qsr->images_iids[$i]).' photos by text';
  }
  $debug[] = 'before perms '.count($ids);

  $search_results['qs']['matching_tags'] = $qsr->all_tags;
  global $template;

  if (empty($ids))
  {
    $debug[] = '-->';
    $template->append('footer_elements', implode("\n", $debug) );
    return $search_results;
  }

  $where_clauses = array();
  $where_clauses[] = 'i.id IN ('. implode(',', $ids) . ')';
  if (!empty($images_where))
  {
    $where_clauses[] = '('.$images_where.')';
  }
  $where_clauses[] = get_sql_condition_FandF(
      array
        (
          'forbidden_categories' => 'category_id',
          'visible_categories' => 'category_id',
          'visible_images' => 'i.id'
        ),
      null,true
    );

  $query = '
SELECT DISTINCT(id)
  FROM '.IMAGES_TABLE.' i
    INNER JOIN '.IMAGE_CATEGORY_TABLE.' AS ic ON id = ic.image_id
  WHERE '.implode("\n AND ", $where_clauses)."\n".
  $conf['order_by'];

  $ids = query2array($query, null, 'id');

  $debug[] = count($ids).' final photo count -->';
  $template->append('footer_elements', implode("\n", $debug) );

  $search_results['items'] = $ids;
  return $search_results;
}

/**
 * Returns an array of 'items' corresponding to the search id.
 * It can be either a quick search or a regular search.
 *
 * @param int $search_id
 * @param bool $super_order_by
 * @param string $images_where optional additional restriction on images table
 * @return array
 */
function get_search_results($search_id, $super_order_by, $images_where='')
{
  $search = get_search_array($search_id);
  if ( !isset($search['q']) )
  {
    $result['items'] = get_regular_search_results($search, $images_where);
    return $result;
  }
  else
  {
    return get_quick_search_results($search['q'], $super_order_by, $images_where);
  }
}
?>