Updated answer to Try It Yourself Ch. 14, problem 2
This commit is contained in:
parent
b35f9dff17
commit
712cec0d31
@ -899,26 +899,27 @@ SELECT (regexp_match('Williams, Sr.', '.*, (.*)'))[1];
|
|||||||
-- regexp_split_to_table() in a subquery to create a table of words to count.
|
-- regexp_split_to_table() in a subquery to create a table of words to count.
|
||||||
-- Bonus: remove commas and periods at the end of each word.
|
-- Bonus: remove commas and periods at the end of each word.
|
||||||
|
|
||||||
-- Answer:
|
|
||||||
-- This query uses a Common Table Expression to first separate each word
|
-- This query uses a Common Table Expression to first separate each word
|
||||||
-- in the text into a separate row in a table named word_list. Then the SELECT
|
-- in the text into a separate row in a table named word_list. As part of the
|
||||||
-- statement counts the words, which are cleaned up with two operations. First,
|
-- operation, we use nested replace functions to remove commas, periods, and
|
||||||
-- several nested replace functions remove commas, periods, and colons. Second,
|
-- colons. Then the SELECT statement counts the words, which we convert
|
||||||
-- all words are converted to lowercase so that when we count we group words
|
-- to lowercase so that when we count we group words that may appear with
|
||||||
-- that may appear with various cases (e.g., "Military" and "military").
|
-- various cases (e.g., "Military" and "military").
|
||||||
|
|
||||||
|
-- Thanks to reader @DavidSwagger for pointing out the replace function
|
||||||
|
-- needed to happen in the WITH statement.
|
||||||
|
|
||||||
WITH
|
WITH
|
||||||
word_list (word)
|
word_list (word)
|
||||||
AS
|
AS
|
||||||
(
|
(
|
||||||
SELECT regexp_split_to_table(speech_text, '\s') AS word
|
SELECT regexp_split_to_table(
|
||||||
|
replace(replace(replace(speech_text, ',', ''), '.', ''), ':', ''),
|
||||||
|
'\s') AS word
|
||||||
FROM president_speeches
|
FROM president_speeches
|
||||||
WHERE speech_date = '1946-01-21'
|
WHERE speech_date = '1946-01-21'
|
||||||
)
|
)
|
||||||
|
SELECT lower(word) AS cleaned_word,
|
||||||
SELECT lower(
|
|
||||||
replace(replace(replace(word, ',', ''), '.', ''), ':', '')
|
|
||||||
) AS cleaned_word,
|
|
||||||
count(*)
|
count(*)
|
||||||
FROM word_list
|
FROM word_list
|
||||||
WHERE length(word) >= 5
|
WHERE length(word) >= 5
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user