General discussion

  • #2318387

    Perl: Finding non-duplicate elements

    by maryweilage

    The June 3 Perl e-newsletter described how to find non-duplicate elements in a list. Will you use this technique to identify and remove duplicate elements in a list? Is there another technique you prefer? Please share what method works for you.

    If you aren’t already subscribed to the Perl e-newsletter, visit our e-newsletter subcenter to sign up for this free TechMail today:
    http://builder.com.com/techmails.jhtml?repID=u001

All Comments

    • #3372363

      There is more than one way to do it

      by swstephe

      In reply to Perl: Finding non-duplicate elements

      As always, there is more than one way to do it. I would think that using grep would be more efficient than using a hash table with flags. In that case, what you want is a list of all elements in @a2 that don't exist in @a1. I wish I remembered how to alias the $_ variable so I could do it with two nested greps, but that might be less readable. The complementary "in common" version does the same thing, with one small switch in the logic.


      @a1 = qw(one two three four five six);
      @a2 = qw(one three five seven nine);

      # what is in common -> @c
      foreach $a (@a2) { push @c,$a if grep {$a eq $_} @a1; }
      print "\@c = (",join(',',@c),")\n";

      # what is different -> @d
      foreach $a (@a2) { push @d,$a unless grep {$a eq $_} @a1; }
      print "\@d = (",join(',',@d),")\n";
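
For comparison, here is a minimal sketch of the hash-based alternative mentioned in the reply above. It is not the newsletter's code. The helper name common_and_different is made up for illustration. The grep version rescans @a1 once for every element of @a2, while a hash built from @a1 answers each membership question with a single lookup, which tends to matter as the lists grow.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split @$candidates into the
# elements that also appear in @$reference and those that do not.
sub common_and_different {
    my ($reference, $candidates) = @_;

    # Build a lookup table once so each membership test is a hash lookup.
    my %in_reference = map { $_ => 1 } @$reference;

    my (@common, @different);
    for my $item (@$candidates) {
        if ($in_reference{$item}) {
            push @common, $item;
        }
        else {
            push @different, $item;
        }
    }
    return (\@common, \@different);
}

my @a1 = qw(one two three four five six);
my @a2 = qw(one three five seven nine);

my ($c, $d) = common_and_different(\@a1, \@a2);
print "\@c = (", join(',', @$c), ")\n";   # prints: @c = (one,three,five)
print "\@d = (", join(',', @$d), ")\n";   # prints: @d = (seven,nine)
```

The order of @a2 is preserved in both result lists, as it is in the grep version.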

    • #3372358

      large lists = heavy memory load

      by zkent

      In reply to Perl: Finding non-duplicate elements

      I used this technique once on a list of several thousand email recipients. I loaded their user_ids into a @found array so that I would not send the email twice to the same person, but by the time the list had loaded into the array (it was being read from a flat file), the server killed the process. I had to find another method, since arrays are memory intensive. For short lists, it is an easy, effective method.
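
One possible way around the memory problem described above is to process the flat file one line at a time and remember only the IDs already handled, instead of loading the whole list first. The sketch below assumes one user ID per line. The helper name for_each_unique_id and the sample IDs are invented for illustration, and an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read IDs one per line from a
# filehandle and invoke $callback the first time each ID appears.
# Only the %seen hash is held in memory; the full list is never stored.
sub for_each_unique_id {
    my ($fh, $callback) = @_;
    my %seen;
    my $count = 0;
    while (my $id = <$fh>) {
        chomp $id;
        next if $id eq '';       # skip blank lines
        next if $seen{$id}++;    # skip IDs that were already handled
        $callback->($id);
        $count++;
    }
    return $count;
}

# Sample data standing in for the flat file (assumed format: one ID per line).
my $data = "alice\nbob\nalice\ncarol\nbob\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $sent = for_each_unique_id($fh, sub { print "send to $_[0]\n" });
close $fh;

print "$sent unique recipients\n";   # prints: 3 unique recipients
```

The %seen hash still grows with the number of distinct IDs, so this only reduces the footprint rather than removing it. If even that is too much, a disk-backed tied hash (for example via the core SDBM_File module) might be worth trying, though that is untested here.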
