I have this table in SQL:
CREATE TABLE mytable (
id BIGINT,
color VARCHAR(10),
food VARCHAR(10),
sport VARCHAR(12),
animal VARCHAR(10)
) DISTRIBUTE ON (id);
INSERT INTO mytable (id, color, food, sport, animal) VALUES
( 1, 'red', 'pizza', 'soccer', 'dog'),
( 2, 'blue', 'sushi', 'tennis', 'cat'),
( 3, 'green', 'tacos', 'basketball', 'horse'),
( 4, 'yellow', 'pasta', 'golf', 'rabbit'),
( 5, 'pink', 'pizza', 'swimming', 'fox'),
( 6, 'orange', 'burger', 'soccer', 'bear'),
( 7, 'black', 'ramen', 'hockey', 'wolf'),
( 8, 'purple', 'tacos', 'cricket', 'cat'),
( 9, 'white', 'salad', 'tennis', 'dog'),
(10, 'brown', 'pizza', 'running', 'deer'),
(11, 'cyan', 'curry', 'boxing', 'tiger'),
(12, 'gray', 'sushi', 'soccer', 'horse'),
(13, 'teal', 'burger', 'baseball', 'owl'),
(14, 'maroon', 'pasta', 'golf', 'dog'),
(15, 'navy', 'tacos', 'rugby', 'frog');
I want to randomly select 15 rows from this table (i.e. sampling without replacement).
The final result should look something like this:
resample_id | original_id | color | food | sport | animal
------------+-------------+--------+--------+------------+--------
1 | 7 | black | ramen | hockey | wolf
2 | 3 | green | tacos | basketball | horse
3 | 7 | black | ramen | hockey | wolf
4 | 12 | gray | sushi | soccer | horse
5 | 1 | red | pizza | soccer | dog
6 | 15 | navy | tacos | rugby | frog
7 | 7 | black | ramen | hockey | wolf
8 | 9 | white | salad | tennis | dog
9 | 3 | green | tacos | basketball | horse
10 | 14 | maroon | pasta | golf | dog
11 | 2 | blue | sushi | tennis | cat
12 | 12 | gray | sushi | soccer | horse
13 | 5 | pink | pizza | swimming | fox
14 | 10 | brown | pizza | running | deer
15 | 5 | pink | pizza | swimming | fox
I thought of the following logic - generate random numbers corresponding to the rows that need to be selected and then join them back to the original table:
SELECT
g.i AS resample_id,
t.id AS original_id,
t.color, t.food,
t.sport,
t.animal
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY NULL) AS i,
CAST((RANDOM() * 15) AS BIGINT) + 1 AS pick_id
FROM mytable
LIMIT 15
) g
JOIN mytable t ON t.id = g.pick_id
ORDER BY g.i;
This is giving me an empty response.
Can someone please help me figure out what I am doing wrong?