PHWinfo banniere

Titres
PORTAIL ANNUAIRE ARTICLES COMPARATEUR HÉBERGEURS DEVIS FORUMS RÉDUCTEUR D'URL
Précédent   PHWinfo > Autres forums > Forum Programmation & Conception > comp.lang.cplus > Universal hashing
S'inscrire FAQ Membres Recherche Messages du jour Marquer les forums comme lus
Universal hashing

Réponse
 
LinkBack Outils de la discussion
Vieux 07/04/2008, 14h45   #1
desktop
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Universal hashing

Maybe not specific related to C++ but inspired by the recent discussions
about unordered_set in the upcoming C++x0 release.

When using universal hashing a new hash-function from a family of
hash-functions is used to map a key into the table.

Lets assume that hash-function H1 has been used to map key k1 to the
table and H2 to map k2 to the table.

I would now like to make a lookup on k1. But from the point of view of
lookup how does it know which hash-function to use?


  Réponse avec citation
Vieux 07/04/2008, 16h03   #2
tommy.hinks@gmail.com
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

On Apr 7, 2:45 pm, desktop <asd...@asd.com> wrote:
> Maybe not specific related to C++ but inspired by the recent discussions
> about unordered_set in the upcoming C++x0 release.
>
> When using universal hashing a new hash-function from a family of
> hash-functions is used to map a key into the table.
>
> Lets assume that hash-function H1 has been used to map key k1 to the
> table and H2 to map k2 to the table.
>
> I would now like to make a lookup on k1. But from the point of view of
> lookup how does it know which hash-function to use?


Presumably this hash-function would be a template parameter to the
std::unordered_set, and would therefore be the same for all elements
in the set. Using different hash-function when inserting elements
defeats the purpose of hashing. The way that today's std::set<> and
std::map<> work are by specifying a predicate operator for equivalence
testing is very similar to the above argument.

T
  Réponse avec citation
Vieux 07/04/2008, 16h28   #3
desktop
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

tommy.hinks@gmail.com wrote:
> On Apr 7, 2:45 pm, desktop <asd...@asd.com> wrote:
>> Maybe not specific related to C++ but inspired by the recent discussions
>> about unordered_set in the upcoming C++x0 release.
>>
>> When using universal hashing a new hash-function from a family of
>> hash-functions is used to map a key into the table.
>>
>> Lets assume that hash-function H1 has been used to map key k1 to the
>> table and H2 to map k2 to the table.
>>
>> I would now like to make a lookup on k1. But from the point of view of
>> lookup how does it know which hash-function to use?

>
> Presumably this hash-function would be a template parameter to the
> std::unordered_set, and would therefore be the same for all elements
> in the set. Using different hash-function when inserting elements
> defeats the purpose of hashing. The way that today's std::set<> and
> std::map<> work are by specifying a predicate operator for equivalence
> testing is very similar to the above argument.
>
> T


Universal hashing as I understand is used to avoid that all elements (in
theory possible) hash to the same "bucket". Its a method to distribute
the keys as even as possible.

This means that you don't provide a single hash-function as a template
parameter but a family of hash-functions. But I still don't see how this
can work when you make a lookup.
  Réponse avec citation
Vieux 07/04/2008, 16h29   #4
outlaw
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

> Lets assume that hash-function H1 has been used to map key k1 to the
> table and H2 to map k2 to the table.
>
> I would now like to make a lookup on k1. But from the point of view of
> lookup how does it know which hash-function to use?



Read the TR1 specs here: http://www.open-std.org/jtc1/sc22/wg...2005/n1836.pdf
Specifically page 98 has the unordered_set<> header:


namespace std {
namespace tr1 {
// [6.3.4.3] Class template unordered_set
template <class Value,
class Hash = hash<Value>,
class Pred = std::equal_to<Value>,
class Alloc = std::allocator<Value> >
class unordered_set;

section 6.3.1 describes the hash as

3.
Each unordered associative container is parameterized by Key, by a
function object Hash that acts as a hash function
for values of type Key, and by a binary predicate Pred that induces an
equivalence relation on values of type Key.
Additionally, unordered_map and unordered_multimap associate an
arbitrary mapped type T with the Key.

4.
A hash function is a function object that takes a single argument of
type Key and returns a value of type std::size_t.

So to answer your question, the container knows at compile time which
hash function to use.

marioc@computer.org
http://securitymario.spaces.live.com/
  Réponse avec citation
Vieux 07/04/2008, 20h24   #5
desktop
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

outlaw wrote:
>> Lets assume that hash-function H1 has been used to map key k1 to the
>> table and H2 to map k2 to the table.
>>
>> I would now like to make a lookup on k1. But from the point of view of
>> lookup how does it know which hash-function to use?

>
>
> Read the TR1 specs here: http://www.open-std.org/jtc1/sc22/wg...2005/n1836.pdf
> Specifically page 98 has the unordered_set<> header:
>
>
> namespace std {
> namespace tr1 {
> // [6.3.4.3] Class template unordered_set
> template <class Value,
> class Hash = hash<Value>,
> class Pred = std::equal_to<Value>,
> class Alloc = std::allocator<Value> >
> class unordered_set;
>
> section 6.3.1 describes the hash as
>
> 3.
> Each unordered associative container is parameterized by Key, by a
> function object Hash that acts as a hash function
> for values of type Key, and by a binary predicate Pred that induces an
> equivalence relation on values of type Key.
> Additionally, unordered_map and unordered_multimap associate an
> arbitrary mapped type T with the Key.
>
> 4.
> A hash function is a function object that takes a single argument of
> type Key and returns a value of type std::size_t.
>
> So to answer your question, the container knows at compile time which
> hash function to use.
>
> marioc@computer.org
> http://securitymario.spaces.live.com/


But thats not very universal. The point with universal hashing is to
randomly select a hash-function each time an operation is made (like
insert). This is described in Introduction to Algorithms in Cormen et. al.

  Réponse avec citation
Vieux 08/04/2008, 10h50   #6
Alan Johnson
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

desktop wrote:
> outlaw wrote:
>>> Lets assume that hash-function H1 has been used to map key k1 to the
>>> table and H2 to map k2 to the table.
>>>
>>> I would now like to make a lookup on k1. But from the point of view of
>>> lookup how does it know which hash-function to use?

>>
>>
>> Read the TR1 specs here:
>> http://www.open-std.org/jtc1/sc22/wg...2005/n1836.pdf
>> Specifically page 98 has the unordered_set<> header:
>>
>>
>> namespace std {
>> namespace tr1 {
>> // [6.3.4.3] Class template unordered_set
>> template <class Value,
>> class Hash = hash<Value>,
>> class Pred = std::equal_to<Value>,
>> class Alloc = std::allocator<Value> >
>> class unordered_set;
>>
>> section 6.3.1 describes the hash as
>>
>> 3.
>> Each unordered associative container is parameterized by Key, by a
>> function object Hash that acts as a hash function
>> for values of type Key, and by a binary predicate Pred that induces an
>> equivalence relation on values of type Key.
>> Additionally, unordered_map and unordered_multimap associate an
>> arbitrary mapped type T with the Key.
>>
>> 4.
>> A hash function is a function object that takes a single argument of
>> type Key and returns a value of type std::size_t.
>>
>> So to answer your question, the container knows at compile time which
>> hash function to use.
>>
>> marioc@computer.org
>> http://securitymario.spaces.live.com/

>
> But thats not very universal. The point with universal hashing is to
> randomly select a hash-function each time an operation is made (like
> insert). This is described in Introduction to Algorithms in Cormen et. al.
>


I think you misunderstand universal hashing.

You don't choose a new hash function for each insertion. Each time you
instantiate a new hash table (e.g. each time the program is run), you
choose a new hash function.

The general idea, in somewhat less than precise terms, is that any given
hash function (even one belonging to a universal class) has a set of
"bad" inputs. However, by choosing a random hash function each time we
create a hash table, we can say that the probability is low that any
particular input will be bad. Choosing from a universal class just lets
us do some math to show what exactly what is mean by "the probability is
low".

As a concrete example, compilers commonly store the symbol table for the
code they are compiling in a hash table. If a deterministic hash
function was used, this would mean there would be some program that
would compile very slowly because the variable names the programmer
chose happen to have a high number of collisions. Choosing a random
hash function, though, means that no such program exists. It is
possible that the hash function that is randomly chosen happens to be
poor for a particular program that is being compiled, but by carefully
selecting the family of hash functions we are choosing from (a universal
class) we limit the probability of that occurring.

--
Alan Johnson
  Réponse avec citation
Vieux 08/04/2008, 14h40   #7
outlaw
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Universal hashing

> You don't choose a new hash function for each insertion. Each time you
> instantiate a new hash table (e.g. each time the program is run), you
> choose a new hash function.


No, but the type has to be known at compile time with C++ generic
programming.
Now, What you do in your datatype at runtime is up to you, if you
wanted to simply "return 1;" from the hash function each time, or
return BCryptGenRandom() each time, so be it.
You'd have a linear time ordered container with the former; and
constant time inserts with the latter, without an easy way of
retrieving it afterwards.

The OP seemed to want to do something akin to
std::tr1::unordered_set<HisObjs, new HisHash<HisObjs> >; which isn't
syntactically correct.
  Réponse avec citation
Réponse


Outils de la discussion

Règles de messages
Vous ne pouvez pas créer de nouvelles discussions
Vous ne pouvez pas envoyer des réponses
Vous ne pouvez pas envoyer des pièces jointes
Vous ne pouvez pas modifier vos messages

Les balises BB sont activées : oui
Les smileys sont activés : oui
La balise [IMG] est activée : oui
Le code HTML peut être employé : non
Trackbacks are oui
Pingbacks are oui
Refbacks are oui


Fuseau horaire GMT +1. Il est actuellement 22h48.


Édité par : vBulletin® version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Friendly URLs by vBSEO 3.2.0 RC5 Tous droits réservés.
Version française #16 par l'association vBulletin francophone
PHWinfo est un site Éducation Sans Frontières ©2000-2008
Ad Management by RedTyger
©Tous droits réservés par les parties respectives
Page generated in 0,18464 seconds with 15 queries