Monday, January 10, 2022

[FIXED] Trying to create individual integer values for lists of tuples

Labels: anaconda, python, spyder

Issue

I'm trying to create a rudimentary sentiment analyzer. I have lists of words in categories, and two CSV files from Reddit threads that I'm taking comments from. I've managed to tag my data sets with the appropriate tags, and I now have a list of lists of tuples, with one inner list per comment. I have a piece of code that I hoped would produce a single value for each comment based on the tags present, but I'm hitting a brick wall mentally.

I've tried the code below, which results in 0 at best and a ValueError at worst. I know it's gotta be chock full of bad ideas, but I'm at a loss. At this point I just want something to FUNCTION T_T

tLOTR = [[('terrible', 'negative'),
  ('so', 'intensifier'),
  ('awesome', 'positive'),
  ('so', 'intensifier'),
  ('but', 'shifter'),
  ('agree', 'positive'),
  ('like', 'positive'),
  ('really', 'intensifier'),
  ('but', 'shifter'),
  ('but', 'shifter'),
  ('so', 'intensifier'),
  ('not', 'shifter'),
  ('like', 'positive'),
  ('really', 'intensifier'),
  ('like', 'positive'),
  ('so', 'intensifier')],
 [('not', 'shifter'),
  ('amazing', 'positive'),
  ('but', 'shifter'),
  ('bad', 'negative'),
  ('but', 'shifter'),
  ('like', 'positive'),
  ('awful', 'negative'),
  ('but', 'shifter'),
  ('like', 'positive'),
  ('but', 'shifter'),
  ('so', 'intensifier'),
  ('completely', 'intensifier'),
  ('wrong', 'negative')]]

# these are just a few of my tagged sets

def sentalize(text):
    value = 0
    for x in text:
        for (word, tag) in x:
            if tag == "positive":
                value += 1
            elif tag == "negative":
                value -= 1
            elif tag == "shifter":
                value *= -1
            elif tag == "intensifier":
                value *= 1.25
    return value

So I'm getting either 0 or a ValueError when I run it on a single comment (tLOTR[0], for instance). Ideally, I'd like a list with one value per comment (comment 1 = -0.348, or something of the sort).

Solution

Assuming you want sentalize() to process individual elements of tLOTR, your problem is the loop:

def sentalize(text):
    value = 0
    for word, tag in text:
        if tag == "positive":
            value += 1
        elif tag == "negative":
            value -= 1
        elif tag == "shifter":
            value *= -1
        elif tag == "intensifier":
            value *= 1.25
    return value


print(sentalize(tLOTR[0]))

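Note how word, tag can be captured in one line by iterating over text directly. Each element of text is already a (word, tag) tuple, so there is no need to first extract a tuple x and then loop over its components, as in your example. That inner loop iterates over the two strings inside the tuple, and trying to unpack a string such as 'terrible' into word, tag is what raises the ValueError.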
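To make the difference concrete, here is a minimal sketch that runs both loop shapes on a shortened comment. The `comment` variable is a made-up stand-in for an element such as tLOTR[0], not part of the original data:

```python
# A shortened, illustrative stand-in for one tagged comment such as tLOTR[0].
comment = [('terrible', 'negative'), ('so', 'intensifier')]

# Iterating over the comment yields one (word, tag) tuple at a time,
# so unpacking directly in the for statement works.
pairs = [(word, tag) for word, tag in comment]

# The nested loop from the question iterates over each tuple instead,
# which yields the strings 'terrible' and 'negative'. Unpacking the
# 8-character string 'terrible' into two names raises ValueError.
try:
    for x in comment:
        for (word, tag) in x:
            pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(pairs)
print(error_message)
```

The second print shows the unpacking error message, which is the same failure the question runs into when sentalize() is given a single comment.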

With that change, you can run values = list(map(sentalize, tLOTR)) and get the result [-2.833251953125, 0.5625].
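As a rough sketch of that last step, the same list can also be built with a list comprehension instead of map. The two short comments in `sample` are made up for illustration (they are not taken from tLOTR), so the values differ from the ones quoted above:

```python
def sentalize(text):
    """Score one tagged comment, given as a list of (word, tag) tuples."""
    value = 0
    for word, tag in text:
        if tag == "positive":
            value += 1
        elif tag == "negative":
            value -= 1
        elif tag == "shifter":
            value *= -1
        elif tag == "intensifier":
            value *= 1.25
    return value


# Made-up stand-in data with the same shape as tLOTR:
# a list of comments, each a list of (word, tag) tuples.
sample = [
    [('awesome', 'positive'), ('so', 'intensifier')],
    [('bad', 'negative'), ('not', 'shifter')],
]

# One score per comment; equivalent to list(map(sentalize, sample)).
values = [sentalize(comment) for comment in sample]

# Label each score with its 1-based comment number.
for number, value in enumerate(values, start=1):
    print(f"comment {number} = {value}")
```

Whether map or a comprehension reads better is largely a matter of taste; both call sentalize() once per comment.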

Some additional comments:

- Storing each word with its type as a string (i.e. "positive", "negative", etc.) is not very efficient; instead, consider representing the type with a simpler value.
- Since you've already parsed the comments and have apparently matched each word with its type of modifier / tag, that would possibly be the right time to update value, instead of building this tLOTR list of intermediate values.
- Combining operators like -= and += with positive and negative constant values like 1 and -1 is very confusing. I'd recommend using only += and *=, with negative or positive values where appropriate.
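One possible reading of the first and third comments is sketched below: each tag is looked up in a table of numeric effects, and the loop only ever uses *= and +=. The names `TAG_EFFECTS` and `score_comment`, the (multiplier, addend) layout, and the choice to ignore unknown tags are all assumptions made for this illustration, not something prescribed above:

```python
# Hypothetical lookup table: tag -> (multiplier, addend).
# The numbers mirror the effects used in sentalize().
TAG_EFFECTS = {
    "positive": (1, 1),
    "negative": (1, -1),
    "shifter": (-1, 0),
    "intensifier": (1.25, 0),
}


def score_comment(tagged_words):
    """Score one comment, given as a list of (word, tag) tuples."""
    value = 0
    for _word, tag in tagged_words:
        # Assumption: a tag missing from the table leaves the value unchanged.
        multiplier, addend = TAG_EFFECTS.get(tag, (1, 0))
        value *= multiplier
        value += addend
    return value


# Made-up stand-in data with the same shape as tLOTR.
sample = [
    [('awesome', 'positive'), ('so', 'intensifier')],
    [('bad', 'negative'), ('not', 'shifter')],
]

scores = [score_comment(comment) for comment in sample]
print(scores)
```

With this layout, adding a new tag type means adding one table entry rather than another elif branch. The table could also be keyed by a more compact value than a string if the tagging step is changed to produce one.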

Answered By - Grismar

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
