Ultra Fast URL Lookup Engine
The specialized Ultra Fast URL Lookup engine (UFULe) stores hashed URLs together with their priority (up to 256), their categories (up to 32), the type of URL, and special URL information. UFULe also stores a list of requester IPs and requester IDs, so that these two parameters can be taken into account when deciding about a URL and more complex policies can be built on them.
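To make the stored fields concrete, here is a minimal sketch of how one such record could be packed. This is not UFULe's actual format: the field widths, the choice of BLAKE2 as the hash, and the names RECORD, pack_record and unpack_record are all assumptions made for illustration. The widths were picked so that one record takes 15 bytes, which is consistent with the storage figure quoted in this description (1.5 GB for 100 million URLs).

```python
import hashlib
import struct

# Hypothetical layout (an assumption, not the real UFULe format):
# 8-byte URL hash, 1-byte priority, 4-byte category bitmask,
# 1-byte URL type, 1-byte special-information flags.
# "<" disables padding, so the record is exactly 15 bytes.
RECORD = struct.Struct("<QBIBB")


def pack_record(canonical_url, priority, categories, url_type=0, flags=0):
    """Pack one URL entry into a fixed-width binary record."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must fit in one byte")
    mask = 0
    for category in categories:
        if not 0 <= category < 32:
            raise ValueError("category index must be in 0..31")
        mask |= 1 << category  # one bit per category, up to 32
    digest = hashlib.blake2b(canonical_url.encode("utf-8"), digest_size=8).digest()
    key = int.from_bytes(digest, "little")
    return RECORD.pack(key, priority, mask, url_type, flags)


def unpack_record(blob):
    """Return (key, priority, category list, url_type, flags)."""
    key, priority, mask, url_type, flags = RECORD.unpack(blob)
    categories = [c for c in range(32) if (mask >> c) & 1]
    return key, priority, categories, url_type, flags
```

Because only the hash is stored, the original URL cannot be recovered from a record; a lookup hashes the canonical form of the queried URL and compares keys.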
UFULe incorporates a fast URL canonicalizer that supports Unicode domains and paths (ISO-8859-1), as well as special valid URLs that do not follow the cited RFC.
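As an illustration of what canonicalization typically involves, the following sketch uses only the Python standard library. It is not UFULe's canonicalizer: the function name canonicalize and the specific rules (default scheme http, IDNA encoding of the host, dropping default ports and fragments, decoding percent-escapes as UTF-8) are assumptions chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit, quote, unquote

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url):
    """Return a normalized form of url (illustrative rules only)."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # assumption: bare hosts default to http
    parts = urlsplit(url)
    scheme = parts.scheme.lower()

    # hostname is already lower-cased; encode Unicode labels with IDNA.
    host = (parts.hostname or "").rstrip(".")
    if host:
        host = host.encode("idna").decode("ascii")
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"

    # Decode each segment, resolve "." and "..", then re-encode uniformly.
    segments = []
    for raw in parts.path.split("/"):
        segment = unquote(raw)
        if segment == "..":
            if segments:
                segments.pop()
        elif segment not in ("", "."):
            segments.append(quote(segment, safe=""))
    path = "/" + "/".join(segments)
    if parts.path.endswith("/") and segments:
        path += "/"

    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

For example, canonicalize("example.com") yields "http://example.com/". A production canonicalizer would also need rules for the query string and for the non-conforming URLs mentioned above, which this sketch leaves untouched.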
UFULe stores all of the data mentioned above for 100 million URLs in only 1.5 GB, or about 15 bytes per URL. It can serve up to 200,000 queries per second on the stored data. For each query, the URL is normalized and, based on the URL parent flags, broken down into all of its sub-domains, sub-paths and sub-queries. In addition, the following can be checked for each URL if specified:
- URL in URL (both plain and encoded)
- Keyword in Domain
- Keyword in paths or queries
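The decomposition into sub-domains and sub-paths, and the three optional checks listed above, can be sketched roughly as follows. All function names here (lookup_candidates, has_embedded_url, keyword_in_domain, keyword_in_path_or_query) are hypothetical, and the exact set of candidates UFULe generates is not documented here, so this shows only one plausible scheme.

```python
from urllib.parse import urlsplit, unquote


def lookup_candidates(url):
    """List host/path combinations from most to least specific."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Host suffixes, stopping before the bare top-level label.
    hosts = [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]
    segments = [s for s in parts.path.split("/") if s]
    # Path prefixes from the full path down to "/".
    paths = ["/" + "/".join(segments[:n]) for n in range(len(segments), -1, -1)]

    candidates = []
    for h_index, host in enumerate(hosts):
        for p_index, path in enumerate(paths):
            # The query is only kept on the exact, most specific form.
            if h_index == 0 and p_index == 0 and parts.query:
                candidates.append(f"{host}{path}?{parts.query}")
            candidates.append(host + path)
    return candidates


def has_embedded_url(url):
    """Detect a URL inside a URL, plain or percent-encoded (up to twice)."""
    rest = url.split("://", 1)[-1]  # skip the outer scheme
    return "://" in unquote(unquote(rest))


def keyword_in_domain(url, keywords):
    host = urlsplit(url).hostname or ""
    return any(keyword in host for keyword in keywords)


def keyword_in_path_or_query(url, keywords):
    parts = urlsplit(url)
    text = unquote(parts.path + "?" + parts.query).lower()
    return any(keyword in text for keyword in keywords)
```

Each candidate would then be hashed and looked up in turn, so a rule stored for example.com/ can also match a deeper URL such as a.b.example.com/x/y. The keyword checks assume lower-case keywords.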
The UFULe package consists of an SDK and an update server. The SDK is deployed and used inside web filters. The update server converts URLs stored in a standard DBMS into light patch files. These patch files are a hashed version of CUDB and are consumed by the SDK. The update server is responsible for updating the patch files automatically, and updates are applied at run-time. The update server can also be hosted on a cloud architecture.
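One common way to apply updates at run-time without interrupting lookups is to build the updated table on the side and then swap it in with a single reference assignment. The sketch below illustrates that general idea only; the class LookupTable, its methods, and the patch shape (a mapping of upserts plus a list of deletions) are assumptions and do not describe the real patch file format or SDK interface.

```python
import threading


class LookupTable:
    """In-memory table that can be patched while lookups continue."""

    def __init__(self):
        self._records = {}
        self._write_lock = threading.Lock()  # serializes patch application

    def lookup(self, key):
        # Readers take no lock: they see either the old or the new table.
        return self._records.get(key)

    def apply_patch(self, upserts, deletions):
        """Apply one patch: add or replace records, then remove deleted keys."""
        with self._write_lock:
            updated = dict(self._records)  # copy, so readers are undisturbed
            updated.update(upserts)
            for key in deletions:
                updated.pop(key, None)
            self._records = updated  # swap in the new table in one step
```

Copying the whole table is only practical for small data sets; at the scale of 100 million records a real engine would need an incremental structure, which is outside the scope of this sketch.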