选择一个唯一列,其他可以非唯一。 - MySQL
我有一个临时表,里面有135000行数据,我想把这个临时表中的一些值插入到其他表里。
这是我使用的表结构:
tvtemptable
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| PROGTITLE | text | YES | | NULL | |
| SUBTITLE | text | YES | | NULL | |
| EPISODE | text | YES | | NULL | |
| YR | year(4) | YES | | NULL | |
| DIRECTOR | text | YES | | NULL | |
| PERFORMERS | text | YES | | NULL | |
| PREMIERE | tinyint(1) | YES | | NULL | |
| FILM | tinyint(1) | YES | | NULL | |
| RPEAT | tinyint(1) | YES | | NULL | |
| SUBTITLES | tinyint(1) | YES | | NULL | |
| WIDESCREEN | tinyint(1) | YES | | NULL | |
| NEWSERIES | tinyint(1) | YES | | NULL | |
| DEAFSIGNED | tinyint(1) | YES | | NULL | |
| BNW | tinyint(1) | YES | | NULL | |
| STARRATING | tinyint(4) | YES | | NULL | |
| CERTIFICATE | varchar(5) | YES | | NULL | |
| GENRE | varchar(50) | YES | | NULL | |
| DESCRIPTION | text | YES | | NULL | |
| CHOICE | tinyint(1) | YES | | NULL | |
| PROGDATE | date | YES | | NULL | |
| STARTIME | time | YES | | NULL | |
| ENDTIME | time | YES | | NULL | |
| DURATION | int(11) | YES | | NULL | |
| CHANNELID | int(11) | NO | | NULL | |
+-------------+-------------+------+-----+---------+-------+
channels
+-------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+----------------+
| CHANNELID | int(11) | NO | PRI | NULL | auto_increment |
| CHANNELNAME | varchar(50) | YES | | NULL | |
+-------------+-------------+------+-----+---------+----------------+
genre
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| GENREID | int(11) | NO | PRI | NULL | auto_increment |
| GENRENAME | varchar(50) | YES | | NULL | |
+-----------+-------------+------+-----+---------+----------------+
programme
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| PROGRAMMEID | int(11) | NO | PRI | NULL | auto_increment |
| GENREID | int(11) | NO | MUL | NULL | |
| PROGTITLE | text | YES | | NULL | |
| YR | year(4) | YES | | NULL | |
| DIRECTOR | text | YES | | NULL | |
| PERFORMERS | text | YES | | NULL | |
| FILM | tinyint(1) | YES | | NULL | |
| WIDESCREEN | tinyint(1) | YES | | NULL | |
| BNW | tinyint(1) | YES | | NULL | |
| CERTIFICATE | varchar(5) | YES | | NULL | |
| DESCRIPTION | text | YES | | NULL | |
+-------------+------------+------+-----+---------+----------------+
episode
+-------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+----------------+
| EPISODEID | int(11) | NO | PRI | NULL | auto_increment |
| PROGRAMMEID | int(11) | NO | MUL | NULL | |
| SUBTITLE | text | YES | | NULL | |
| EPISODE | text | YES | | NULL | |
| DIRECTOR | text | YES | | NULL | |
| PERFORMERS | text | YES | | NULL | |
| DESCRIPTION | text | YES | | NULL | |
+-------------+---------+------+-----+---------+----------------+
channelprogramme
+--------------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+------------+------+-----+---------+----------------+
| CHANNELPROGRAMMEID | int(11) | NO | PRI | NULL | auto_increment |
| CHANNELID | int(11) | NO | MUL | NULL | |
| PROGRAMMEID | int(11) | NO | MUL | NULL | |
| EPISODEID | int(11) | NO | MUL | NULL | |
| RPEAT | tinyint(1) | YES | | NULL | |
| NEWSERIES | tinyint(1) | YES | | NULL | |
| PREMIERE | tinyint(1) | YES | | NULL | |
| CHOICE | tinyint(1) | YES | | NULL | |
| SUBTITLES | tinyint(1) | YES | | NULL | |
| DEAFSIGNED | tinyint(1) | YES | | NULL | |
| STARRATING | tinyint(4) | YES | | NULL | |
| PROGDATE | date | YES | | NULL | |
| STARTTIME | time | YES | | NULL | |
| ENDTIME | time | YES | | NULL | |
| DURATION | tinyint(4) | YES | | NULL | |
+--------------------+------------+------+-----+---------+----------------+
简单说一下背景,这个数据库是一个电视节目列表数据库,channelprogramme表里包含了每个频道上所有节目的条目,programme表里包含了每个节目的唯一条目,episode表里则包含了有剧集的节目的每一集的唯一条目。
我现在有点困惑,我想从临时表中选择不同的节目标题来填充programme表,但我还想要一些其他的信息。
举个例子,如果我这样做:
select distinct(progtitle) from tvtemptable;
这样的话,只会返回progtitle这一列的值,而我其实想要的是progtitle加上一些其他的信息。
所以如果我尝试选择更多的信息,比如这样:
mysql> select distinct progtitle, yr, director, film from tvtemptable
limit 30;
+-----------------------------------+------+---------------------+------+
| progtitle | yr | director | film |
+-----------------------------------+------+---------------------+------+
| Teleshopping | 2000 | | 0 |
| Cinemania | 2000 | | 0 |
| Whose Line Is It Anyway? | 2000 | | 0 |
| Just Desserts | 2004 | Kevin Connor | 1 |
| Law & Order | 2000 | Matthew Penn | 0 |
| Jane Doe: Yes, I Remember it Well | 2006 | Armand Mastroianni | 0 |
| CSI: NY | 2000 | David Jackson | 0 |
| CSI: Crime Scene Investigation | 2000 | Kenneth Fink | 0 |
| NCIS | 2000 | Colin Bucksey | 0 |
| CSI: Miami | 2000 | | 0 |
| Enter the Dragon | 1973 | Robert Clouse | 1 |
| Close | 2000 | | 0 |
| My Son Is Innocent | 1996 | Larry Elikann | 1 |
| Law & Order | 2000 | Christopher Misiano | 0 |
| Murder 101 | 2006 | Christian I Nyby II | 1 |
| CSI: NY | 2000 | Christine Moore | 0 |
| CSI: Crime Scene Investigation | 2000 | Bill Eagles | 0 |
| Rush Hour | 1998 | Brett Ratner | 1 |
| Dark Blue | 2000 | Jeffrey Hunt | 0 |
| CSI: Crime Scene Investigation | 2000 | Richard J Lewis | 0 |
| Ordinary Miracles | 2005 | Michael Switzer | 1 |
| Law & Order | 2000 | Jace Alexander | 0 |
| Wounded Heart | 1995 | Vic Sarin | 1 |
| CSI: NY | 2000 | Jonathan Glassner | 0 |
| Dark Blue | 2000 | Nathan Hope | 0 |
| Blade: The Series | 2000 | Michael Robison | 0 |
| K-Ville | 2000 | Kevin Dowling | 0 |
| Law & Order | 2000 | Jim Ellis | 0 |
| Reasons of the Heart | 1996 | Rick Jacobson | 1 |
| CSI: NY | 2000 | Anthony Hemingway | 0 |
+-----------------------------------+------+---------------------+------+
progtitle这一列会出现一些重复的值。因为每个新导演的节目并不会重复,某些节目因为有剧集,所以会有不同的导演。
现在如果我只需要额外的一列数据,我已经成功让选择工作正常,比如:
select distinct
t1.progtitle,
(select
t2.director
from
tvtemptable t2
where
t1.progtitle = t2.progtitle
limit 1) as "Director"
from
tvtemptable t1 limit 10;
+-----------------------------------+--------------------+
| progtitle | Director |
+-----------------------------------+--------------------+
| Teleshopping | |
| Cinemania | |
| Whose Line Is It Anyway? | |
| Just Desserts | Kevin Connor |
| Law & Order | Matthew Penn |
| Jane Doe: Yes, I Remember it Well | Armand Mastroianni |
| CSI: NY | David Jackson |
| CSI: Crime Scene Investigation | Kenneth Fink |
| NCIS | Colin Bucksey |
| CSI: Miami | |
+-----------------------------------+--------------------+
但如果我想选择多于一列的数据,这样就会变得很麻烦。
那么,最好的方法是什么呢?
这是我用来从临时表填充programme表的插入命令,目前在我上面提到的选择示例中也有重复的问题。
INSERT INTO PROGRAMME (
PROGTITLE, GENREID, YR, DIRECTOR,
PERFORMERS, FILM, WIDESCREEN, BNW,
CERTIFICATE, DESCRIPTION)
SELECT DISTINCT
T.PROGTITLE, G.GENREID, T.YR, T.DIRECTOR,
T.PERFORMERS, T.FILM, T.WIDESCREEN, T.BNW,
T.CERTIFICATE, T.DESCRIPTION
FROM
TVTEMPTABLE T
INNER JOIN GENRE G ON G.GENRENAME=T.GENRE
LEFT JOIN PROGRAMME P ON P.PROGTITLE=T.PROGTITLE
WHERE
P.PROGTITLE IS NULL
5 个回答
在编程中,有时候我们会遇到一些问题,特别是在使用某些工具或库的时候。这些问题可能会让我们感到困惑,尤其是当我们刚开始学习编程的时候。比如,有些错误信息可能看起来很复杂,但其实它们只是告诉我们哪里出了问题。
当我们在写代码时,可能会用到一些特定的语法或规则。如果我们不小心写错了,程序就会报错。这就像是写作文时拼错了字,老师会指出来,让我们改正。
有时候,解决问题的方法可能很简单,只需要仔细检查一下代码,看看有没有拼写错误或者遗漏的部分。就像是检查一份作业,确保每个步骤都做对了。
总之,编程就像是解谜游戏,遇到问题时不要着急,慢慢来,仔细分析,通常就能找到解决办法。
Insert into xx (a,b)
Select a.ab, b.bb from (
Select distinct ab from a) a, (select distinct bb from b) b
这段话虽然没有直接回答你的问题,但可能会对那些在网上搜索这个问题的人有帮助:
在Postgresql中,你可以使用SELECT DISTINCT ON来实现这个功能,比如:
SELECT DISTINCT ON (p.progtitle) p.* FROM progtitle p;
我不知道MySQL有没有类似的功能。
我可以理解为你的 TVTEMPTABLE
是从某些表中合并而来的结果吗?
如果是这样的话,我觉得你可以先把这个临时表拆分成多个临时表,这些临时表的格式和你目标表的格式是一样的,这样整体工作会更简单。
举个例子(我不知道你的数据,所以只是猜测):
-- All genres (matches your existing genres table)
create table genres_temp as
select distinct genre
from tvtemptable;
-- All programmes (matches your existing programme table)
create table programmes_temp as
select distinct all_the_programme_columns
from tvtemptable;
-- Contains the many-to-many relationship between genres and programmes
create table programme_genre_temp as
select distinct genre, progtitle
from tvtemptable;
上面的内容虽然没有直接回答你的问题,但可能会给你一些思路,帮助你解决问题。看起来你的程序信息在每个 某个东西 中都重复了。你需要找出这个“某个东西”是什么。
更新 那么每个节目是针对每个导演重复一次吗?那我之前的建议就更有道理了:先提取出实体和它们之间的关系。
更新 2
嗯,你把导演强行和节目关联在一起,但导演其实是和剧集相关的?这似乎是个建模错误。不过,如果你不在乎选哪个导演的话,你可以根据 progtitle
来分组,然后在其他列上使用 MAX()
。这样你就能得到不同的节目标题和其他列的一个值。
SELECT T.PROGTITLE, max(G.GENREID), max(T.YR), max(T.DIRECTOR), etc
FROM TVTEMPTABLE T
INNER JOIN GENRE G ON G.GENRENAME=T.GENRE
LEFT JOIN PROGRAMME P ON P.PROGTITLE=T.PROGTITLE
WHERE P.PROGTITLE IS NULL
group by T.PROGTITLE;