I am using this line to generate a sha1 id for node.js:

```
crypto.createHash('sha1').digest('hex');
```

The problem is that it's returning the same id every time.

Is it possible to have it generate a random id each time so I can use it as a database document id?

Have a look here: How do I use node.js Crypto to create a HMAC-SHA1 hash? I'd create a hash of the current timestamp + a random number to ensure hash uniqueness:

```
var crypto = require('crypto');

// Hash the current timestamp concatenated with a random number
var current_date = (new Date()).valueOf().toString();
var random = Math.random().toString();
crypto.createHash('sha1').update(current_date + random).digest('hex');
```

I'd recommend using `crypto.randomBytes`. It's not `sha1`, but for id purposes, it's quicker, and just as "random".

```
var id = crypto.randomBytes(20).toString('hex');
//=> f26d60305dae929ef8640a75e70dd78ab809cfe9
```

The resulting string will be twice as long as the random bytes you generate; each byte encoded to hex is 2 characters. 20 bytes will be 40 characters of hex.
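The byte-to-hex relationship is easy to check. Here is a minimal sketch, assuming Node's built-in `crypto` module; `randomId` is a hypothetical helper name used only for illustration:

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds a hex id from `byteCount` random bytes.
function randomId(byteCount) {
  return crypto.randomBytes(byteCount).toString('hex');
}

const id = randomId(20);
console.log(id.length); // 40, because each byte becomes two hex characters
```

If 40 characters is longer than you need, passing a smaller byte count shortens the id at the cost of fewer possible values.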

Using 20 bytes, we have `256^20` or **1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976** unique output values. This is **identical** to SHA1's 160-bit (20-byte) possible outputs.

Knowing this, it's not really meaningful for us to `shasum` our random bytes. It's like rolling a die twice but only accepting the second roll; no matter what, you have 6 possible outcomes each roll, so the first roll is sufficient.

**Why is this better?**

To understand why this is better, we first have to understand how hashing functions work. Hashing functions (including SHA1) will always generate the same output if the same input is given.

Say we want to generate IDs but our random input is generated by a coin toss. We have `"heads"` or `"tails"`:

```
% echo -n "heads" | shasum
c25dda249cdece9d908cc33adcd16aa05e20290f -
% echo -n "tails" | shasum
71ac9eed6a76a285ae035fe84a251d56ae9485a4 -
```

If `"heads"` comes up again, the SHA1 output will be the **same** as it was the first time:

```
% echo -n "heads" | shasum
c25dda249cdece9d908cc33adcd16aa05e20290f -
```
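The same determinism explains the behaviour in the question: calling `createHash('sha1').digest('hex')` without any `update()` hashes an empty input, so the result never changes. A minimal sketch in Node, where `sha1` is a hypothetical helper name:

```javascript
const crypto = require('crypto');

// Hypothetical helper: hex-encoded SHA1 digest of a string.
function sha1(input) {
  return crypto.createHash('sha1').update(input).digest('hex');
}

// Same input, same output, every time.
console.log(sha1('heads') === sha1('heads')); // true

// No update() call means the hashed input is empty, so the digest is constant.
const empty = crypto.createHash('sha1').digest('hex');
console.log(empty === sha1('')); // true
```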

Ok, so a coin toss is not a great random ID generator because we only have 2 possible outputs.

If we use a standard 6-sided die, we have 6 possible inputs. Guess how many possible SHA1 outputs? 6!

```
input => (sha1) => output
1 => 356a192b7913b04c54574d18c28d46e6395428ab
2 => da4b9237bacccdf19c0760cab7aec4a8359010b0
3 => 77de68daecd823babbb58edb1c8e14d7106e83bb
4 => 1b6453892473a467d07372d45eb05abc2031647a
5 => ac3478d69a3c81fa62e60f5c3696165a4e5e6ac4
6 => c1dfd96eea8cc2b62785275bca38ac261256e278
```
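The table above can be reproduced with a short sketch, assuming the die faces are hashed as the strings `"1"` through `"6"`. Collecting the digests in a `Set` shows that hashing never adds possibilities beyond the number of inputs:

```javascript
const crypto = require('crypto');

// Each die face, hashed as a one-character string.
const faces = ['1', '2', '3', '4', '5', '6'];
const digests = new Set(
  faces.map(function (face) {
    return crypto.createHash('sha1').update(face).digest('hex');
  })
);

// Only as many distinct ids as there are distinct inputs.
console.log(digests.size);
```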

We both agree that a coin toss or a 6-sided die would make a bad random id generator, because our possible SHA1 results (the value we use for the ID) are very few. But what if we use something that has a lot more outputs? Like a timestamp with milliseconds? Or JavaScript's `Math.random`? Or even a *combination* of those two?!

Let's compute just how many unique ids we would get ...

**The uniqueness of a timestamp with milliseconds**

When using `(new Date()).valueOf().toString()`, you're getting a 13-character number (e.g., `1375369309741`). However, since this is a sequentially updating number (once per millisecond), consecutive calls almost always return the same value. Let's take a look:

```
for (var i=0; i<10; i++) {
  console.log((new Date()).valueOf().toString());
}
console.log("OMG so not random");
// 1375369431838
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431840
// 1375369431840
// OMG so not random
```

To be fair, for comparison purposes, **in a given minute** (a generous operation execution time), you will have `60*1000` or `60000` uniques.
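To see how few distinct timestamps a busy loop actually produces, here is a rough sketch. The call count of 100,000 is an arbitrary choice for illustration, and the exact number of distinct values depends on the machine:

```javascript
// Collect millisecond timestamps from a tight loop and count the distinct ones.
const calls = 100000; // arbitrary sample size, chosen for illustration
const seen = new Set();
for (let i = 0; i < calls; i++) {
  seen.add(Date.now());
}

// Many calls land in the same millisecond, so distinct values are far fewer than calls.
console.log(calls + ' calls, ' + seen.size + ' distinct timestamps');
```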

**The uniqueness of Math.random**

Now, when using `Math.random`, because of the way JavaScript represents 64-bit floating point numbers, you'll get a number anywhere between 13 and 24 characters long. A longer result means more digits, which means more entropy. First, we need to find out which is the most probable length.

The script below will determine which length is most probable. We do this by generating 1 million random numbers and incrementing a counter based on the `.length` of each number.

```
// get distribution
var counts = [], rand, len;
for (var i=0; i<1000000; i++) {
  rand = Math.random();
  len = String(rand).length;
  if (counts[len] === undefined) counts[len] = 0;
  counts[len] += 1;
}
// calculate % frequency
var freq = counts.map(function(n) { return n/1000000 *100 });
```

By dividing each counter by 1 million, we get the probability of each length of number returned from `Math.random`.

```
len frequency(%)
------------------
13 0.0004
14 0.0066
15 0.0654
16 0.6768
17 6.6703
18 61.133 <- highest probability
19 28.089 <- second highest probability
20 3.0287
21 0.2989
22 0.0262
23 0.0040
24 0.0004
```

So, even though it's not entirely true, let's be generous and say you get a 19-character-long random output; `0.12345678901234567`. The first characters will always be `0` and `.`, so really we're only getting 17 random characters. This leaves us with `10^17` `+1` (for possible `0`; see notes below) or **100,000,000,000,000,001** uniques.

**So how many random inputs can we generate?**

Ok, we calculated the number of results for a millisecond timestamp and `Math.random`:

```
100,000,000,000,000,001 (Math.random)
* 60,000 (timestamp)
-----------------------------
6,000,000,000,000,000,060,000
```

That's a single 6,000,000,000,000,000,060,000-sided die. Or, to make this number more humanly digestible, this is **roughly** the same number as

```
input                                            outputs
------------------------------------------------------------------------------
( 1×) 6,000,000,000,000,000,060,000-sided die    6,000,000,000,000,000,060,000
(28×) 6-sided die                                6,140,942,214,464,815,497,216
(72×) 2-sided coins                              4,722,366,482,869,645,213,696
```
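These figures are too large for ordinary JavaScript numbers to hold exactly, so a quick way to check them is with `BigInt`. This sketch assumes a Node version with `BigInt` support (10.4 or later); the variable names are only illustrative:

```javascript
// Possible inputs from each source, as estimated above.
const mathRandom = 10n ** 17n + 1n; // 17 random digits, plus 1 for a possible 0
const timestamp = 60n * 1000n;      // milliseconds in one minute
const combined = mathRandom * timestamp;

console.log(combined.toString());     // 6000000000000000060000
console.log((6n ** 28n).toString());  // 28 rolls of a 6-sided die
console.log((2n ** 72n).toString());  // 72 coin tosses

// The combined generator sits between 72 coin tosses and 28 die rolls.
console.log(2n ** 72n < combined && combined < 6n ** 28n); // true
```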

Sounds pretty good, right? Well, let's find out ...

SHA1 produces a 20-byte value, with a possible 256^20 outcomes. So we're really not using SHA1 to its full potential. Well, how much are we using?

```
node> 6000000000000000060000 / Math.pow(256,20) * 100
```

```
generator sha1 potential used
-----------------------------------------------------------------------------
crypto.randomBytes(20) 100%
Date() + Math.random() 0.00000000000000000000000000411%
6-sided die 0.000000000000000000000000000000000000000000000411%
A coin 0.000000000000000000000000000000000000000000000137%
```

Holy cats, man! Look at all those zeroes. So how much better is `crypto.randomBytes(20)`? **243,583,606,221,817,150,598,111,409** times better.
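The "times better" figure is the number of possible SHA1 outputs divided by the number of inputs the timestamp plus `Math.random` approach can produce. A sketch of that arithmetic, again assuming `BigInt` support; note that `BigInt` division truncates toward zero, so the result is a whole number:

```javascript
// All possible 20-byte values versus the combined timestamp + Math.random inputs.
const sha1Outputs = 256n ** 20n;
const combinedInputs = 6000000000000000060000n;

// BigInt division discards the fractional part.
const timesBetter = sha1Outputs / combinedInputs;
console.log(timesBetter.toString());

// The same comparison as a percentage, using ordinary floating point.
const percentUsed = 6000000000000000060000 / Math.pow(256, 20) * 100;
console.log(percentUsed); // a value on the order of 4e-25
```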

**Notes about the +1 and frequency of zeroes**

If you're wondering about the `+1`, it's possible for `Math.random` to return a `0`, which means there's 1 more possible unique result we have to account for.

Based on the discussion that happened below, I was curious about the frequency a `0` would come up. Here's a little script, `random_zero.js`, I made to get some data:

```
#!/usr/bin/env node
var count = 0;
while (Math.random() !== 0) count++;
console.log(count);
```

Then, I ran it in 4 parallel processes (I have a 4-core processor), appending the output to a file:

```
$ yes | xargs -n 1 -P 4 node random_zero.js >> zeroes.txt
```

So it turns out that a `0` is not that hard to get. After 100 values were recorded, the average was

**1 in 3,164,854,823** randoms is a 0

Cool! More research would be required to know if that number is on-par with a uniform distribution of v8's `Math.random` implementation.
