unsuccessfully for a solution to what I believe is a very common
problem. I know it's easy to do wildcard match against data in DB
(using LIKE and "%" and "?").
But is it possible to match a concrete string against a database of
wildcarded data? ("%" and LIKE do not work). For example:
CREATE TABLE blacklist (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
pattern VARCHAR(255) NOT NULL
/* ... */
);
INSERT INTO blacklist (pattern) VALUES ('%foobar.com');
Is there a select query that would match address mars2.foobar.com
against this row?
Some people claim that the following query will work. I have tried it
and it's not true on either Oracle or SQL server.
SELECT * FROM blacklist WHERE 'mars2.foobar.com' LIKE pattern;
Some people suggest breaking up the blacklist table into N varchar
fields for each domain segment and then representing a wildcard
character as a NULL and use isNull to match it. This does work to an
extent. However, a) it seems really ugly, b) does not allow arbitrary
wildcarding (eg %mars%foobar.com), and c) this is something the DB
should do out of the box.
Please help! Humanity will be greatful as there's currently no
solution to this anywhere on newsgroups."Robert Brown" <robertbrown1971@.yahoo.com> wrote in message
news:240a4d09.0404301119.467bd1e1@.posting.google.c om...
> I have researched newsgroups and the web very thoroughly and
> unsuccessfully for a solution to what I believe is a very common
> problem. I know it's easy to do wildcard match against data in DB
> (using LIKE and "%" and "?").
> But is it possible to match a concrete string against a database of
> wildcarded data? ("%" and LIKE do not work). For example:
>
> CREATE TABLE blacklist (
> id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
> pattern VARCHAR(255) NOT NULL
> /* ... */
> );
> INSERT INTO blacklist (pattern) VALUES ('%foobar.com');
> Is there a select query that would match address mars2.foobar.com
> against this row?
> Some people claim that the following query will work. I have tried it
> and it's not true on either Oracle or SQL server.
> SELECT * FROM blacklist WHERE 'mars2.foobar.com' LIKE pattern;
> Some people suggest breaking up the blacklist table into N varchar
> fields for each domain segment and then representing a wildcard
> character as a NULL and use isNull to match it. This does work to an
> extent. However, a) it seems really ugly, b) does not allow arbitrary
> wildcarding (eg %mars%foobar.com), and c) this is something the DB
> should do out of the box.
> Please help! Humanity will be greatful as there's currently no
> solution to this anywhere on newsgroups.
Works fine for me on Oracle:
C:\>sqlplus
Connected to Oracle9i Enterprise Edition Release 9.2.0.1.0
SQL> create table blacklist (pattern varchar2(255));
Table created
SQL> insert into blacklist (pattern) values ('%foobar.com');
1 row inserted
SQL> select * from blacklist where 'mars2.foobar.com' like pattern;
PATTERN
-----------------------
--
%foobar.com|||David Best (davebest@.usa.net) writes:
> Works fine for me on Oracle:
> C:\>sqlplus
> Connected to Oracle9i Enterprise Edition Release 9.2.0.1.0
> SQL> create table blacklist (pattern varchar2(255));
> Table created
> SQL> insert into blacklist (pattern) values ('%foobar.com');
> 1 row inserted
> SQL> select * from blacklist where 'mars2.foobar.com' like pattern;
And the same example (save the funny varchar2) works on MS SQL Server too.
And should work on about any DBMS, as this is core SQL.
--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||"Erland Sommarskog" <sommar@.algonet.se> wrote in message
news:Xns94DC31A1B00CYazorman@.127.0.0.1...
> David Best (davebest@.usa.net) writes:
> > Works fine for me on Oracle:
> > C:\>sqlplus
> > Connected to Oracle9i Enterprise Edition Release 9.2.0.1.0
> > SQL> create table blacklist (pattern varchar2(255));
> > Table created
> > SQL> insert into blacklist (pattern) values ('%foobar.com');
> > 1 row inserted
> > SQL> select * from blacklist where 'mars2.foobar.com' like pattern;
> And the same example (save the funny varchar2) works on MS SQL Server too.
> And should work on about any DBMS, as this is core SQL.
> --
> Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp
It should work, but the query is going to be very inefficient.
Jim|||Jim Kennedy (kennedy-downwithspammersfamily@.attbi.net) writes:
> It should work, but the query is going to be very inefficient.
Yes, if there is an index on pattern it is not going to be useful,
since the match is at the end of the string. But that is not really
the same that it is very ineffecient. If you have a million entries,
you will certainly notice the toll. But with thousand? Not very much.
And thousand is a more likely number than a million.
For this particular case there exists a possible way to speed things up.
Since we search for the end of the string, you could have:
CREATE TABLE blacklist (pattern varchar(225) NOT NULL PRIMARY KEY,
revpattern AS reverse(pattern));
CREATE UNIQUE INDEX revix ON blacklist (revpattern);
go
INSERT blacklist VALUES ('%@.example.com')
go
SELECT pattern FROM blacklist
WHERE reverse('spammer@.example.com') LIKE revpattern
However computed columns is not standard SQL, and may not work on all
DBMSs. The above works in SQL Server.
--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||The query will still need a table (or index) scan, because the column is
to the right of the LIKE operator...
Gert-Jan
Erland Sommarskog wrote:
> Jim Kennedy (kennedy-downwithspammersfamily@.attbi.net) writes:
> > It should work, but the query is going to be very inefficient.
> Yes, if there is an index on pattern it is not going to be useful,
> since the match is at the end of the string. But that is not really
> the same that it is very ineffecient. If you have a million entries,
> you will certainly notice the toll. But with thousand? Not very much.
> And thousand is a more likely number than a million.
> For this particular case there exists a possible way to speed things up.
> Since we search for the end of the string, you could have:
> CREATE TABLE blacklist (pattern varchar(225) NOT NULL PRIMARY KEY,
> revpattern AS reverse(pattern));
> CREATE UNIQUE INDEX revix ON blacklist (revpattern);
> go
> INSERT blacklist VALUES ('%@.example.com')
> go
> SELECT pattern FROM blacklist
> WHERE reverse('spammer@.example.com') LIKE revpattern
> However computed columns is not standard SQL, and may not work on all
> DBMSs. The above works in SQL Server.
> --
> Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp
--
(Please reply only to the newsgroup)
No comments:
Post a Comment