
Posted 2024-04-25 13:42:49




d = {
    "paper_title": [
        "translation using information on dialogue participants",
        "translation using information on dialogue participants",
        "translation using information on dialogue participants",
        "#emotional tweets",
        "#emotional tweets",
        "#emotional tweets",
        "#supportthecause: identifying motivations to participate in online health campaigns",
        "#supportthecause: identifying motivations to participate in online health campaigns",
        "#supportthecause: identifying motivations to participate in online health campaigns",
    ],
    "reference": [
        "beattie, gs (2005, november) #supportthecause: identifying motivations to participate in online health campaigns may 31, 2017, from",
        "burton, n (2012, june 5) depressive realism retrieved may 31, 2017, from",
        "gotlib, i h, 27 hammen, c l (1992) #supportthecause: identifying motivations to participate in online health campaigns new york: wiley",
        "paul ekman 1992 an argument for basic emotions cognition and emotion, 6(3):169200",
        "saif m mohammad 2012a #tagspace: semantic embeddings from hashtags in mail and books to appear in decision support systems",
        "robert plutchik 1985 on emotion: the chickenand-egg problem revisited motivation and emotion, 9(2):197200",
        "alastair iain johnston, rawi abdelal, yoshiko herrera, and rose mcdermott, editors 2009 translation using information on dialogue participants cambridge university press",
        "j richard landis and gary g koch 1977 the measurement of observer agreement for categorical data biometrics, 33(1):159174",
        "tomas mikolov, kai chen, greg corrado, and jeffrey dean 2013  #emotional tweets arxiv:13013781",
    ],
    # "_id": one id per row (the values are not included in this copy of the post)
}


import pandas as pd





Expected Result

Finally, the result dataframe with unique values should look like this:




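def return_id(paper_title, reference, _id):
    # Nothing to compare if either value is missing
    if (paper_title is None) or (reference is None):
        return None
    # Return the id only when the title occurs inside the reference text
    if paper_title in reference:
        return _id
    return None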

df1['paper_present_in'] = df1.apply(lambda row: return_id(row['paper_title'], row['reference'], row['_id']), axis=1)
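
As far as can be told from the code above, `return_id` compares each title only with the reference in the same row, so a title cited in a different row's reference is never picked up. A minimal sketch of that behaviour, assuming a toy dataframe whose titles, references and ids are made up for illustration:

```python
import pandas as pd

# Toy data; titles, references and ids are hypothetical.
df1 = pd.DataFrame({
    "paper_title": ["paper a", "paper b", "paper c"],
    "reference": [
        "smith 2001 paper b journal x",   # cites "paper b", the title of row 1
        "lee 1999 something else",        # cites none of the titles
        "kim 2010 paper c revisited",     # cites its own row's title
    ],
    "_id": ["id_a", "id_b", "id_c"],
})

def return_id(paper_title, reference, _id):
    # Same logic as in the question: same-row substring check
    if paper_title is None or reference is None:
        return None
    return _id if paper_title in reference else None

df1["paper_present_in"] = df1.apply(
    lambda row: return_id(row["paper_title"], row["reference"], row["_id"]),
    axis=1,
)

same_row_matches = df1["paper_present_in"].tolist()
```

Only the last row gets an id here. Row 0 stays empty even though its reference mentions "paper b", which is why the lookup has to run against all unique titles rather than the title of the same row.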

Answer #1 · posted 2024-04-25 13:42:49


# A list to store unique paper titles

# A dict to store mapping of unique paper to unique ids
mapping_dict_paper_to_id = dict()

# A dict mapping each row index to the id of the paper found in that row's reference
mapping_id_to_idx = dict()

# This gives us the unique paper title's list
unique_paper_title = df["paper_title"].unique()

# Storing values in the dict mapping_dict_paper_to_id

for value in unique_paper_title:
    mapping_dict_paper_to_id[value] = df["_id"][df["paper_title"]==value].unique()[0]

# Storing values in the dict mapping_id_to_idx

for value in unique_paper_title:

    # Indexes of the rows whose reference contains this paper_title;
    # regex=False matches the title literally, na=False skips missing references
    idx_list = df[df['reference'].str.contains(value, regex=False, na=False)].index

    # Storing values in the dictionary
    for idx in idx_list:
        mapping_id_to_idx[idx] = mapping_dict_paper_to_id[value]

# This loop checks whether each index has a matching reference id and updates the paper_present_in field accordingly

for i in df.index:
    if i in mapping_id_to_idx:
        df.loc[i, 'paper_present_in'] = mapping_id_to_idx[i]
    else:
        df.loc[i, 'paper_present_in'] = "None"
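
The same idea can probably be written more compactly. The sketch below is one possible variant, not the only way to do it: it builds the title-to-id mapping once and then scans every reference against all unique titles. The helper name `find_cited_id` and the toy data are hypothetical.

```python
import pandas as pd

# Toy data; titles, references and ids are hypothetical.
df = pd.DataFrame({
    "paper_title": ["paper a", "paper a", "paper b"],
    "reference": [
        "smith 2001 paper b journal x",
        "lee 1999 something else",
        "kim 2010 paper a revisited",
    ],
    "_id": ["id_a", "id_a", "id_b"],
})

# One id per unique title (the first occurrence is used, as in the loop above)
title_to_id = (
    df.drop_duplicates("paper_title")
      .set_index("paper_title")["_id"]
      .to_dict()
)

def find_cited_id(reference):
    """Return the id of a known title contained in `reference`, else "None"."""
    found = "None"
    if isinstance(reference, str):
        for title, paper_id in title_to_id.items():
            if title in reference:
                # A later title overwrites an earlier match, like the loop above
                found = paper_id
    return found

df["paper_present_in"] = df["reference"].map(find_cited_id)
result = df["paper_present_in"].tolist()
# result == ['id_b', 'None', 'id_a']
```

Using plain `in` keeps the match literal, so characters such as `#`, `(` or `:` in a title are not treated as regular-expression syntax. Note that this still scans every title for every reference, so it is no faster than the loop version on large data.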

