General discussion

All Comments

    • #3283790

      Better but

      by zlitocook ·

      In reply to Please help us beta test TechRepublic’s new search

      A bit slow, and it should have a spelling-suggestion function: if the search term is close to a known word, it could ask whether that is what you meant. But it works better than the old search.
      What search engine are you using?
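
A minimal sketch of the "did you mean" behaviour requested above, using Python's standard `difflib` module. The vocabulary list, the `suggest` name, and the 0.75 cutoff are illustrative assumptions, not anything TechRepublic's search is known to use.

```python
import difflib

# Hypothetical vocabulary; a real system would draw this from its index.
KNOWN_TERMS = ["search", "server", "security", "storage"]

def suggest(term, vocabulary, max_suggestions=3, cutoff=0.75):
    """Return 'did you mean' candidates for a term that is not in the vocabulary."""
    lowered = term.lower()
    if lowered in vocabulary:
        return []  # the term is already known, so there is nothing to suggest
    return difflib.get_close_matches(
        lowered, vocabulary, n=max_suggestions, cutoff=cutoff
    )

print(suggest("serach", KNOWN_TERMS))
```

A higher cutoff gives fewer but safer suggestions; a lower one catches more typos at the cost of odd guesses.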

      • #3209339

        Working on Speed

        by jfpsf ·

        In reply to Better but

        We are working on speeding it up. You should see improvements in a day or two. I will look into adding common misspellings as synonyms.

    • #3283753

      It appears to work very well

      by nicknielsen ·

      In reply to Please help us beta test TechRepublic’s new search

      I have yet to not find an article or discussion with the search tags. Now all we have to work on is the tagging. For example, I searched TR for years for the page defining the emoticons, but could never find it. Why?

      Because the page is tagged “software” only. Go figure.

    • #3283643

      I like it

      by tig2 ·

      In reply to Please help us beta test TechRepublic’s new search

      I tried a number of references and oblique references based on the tagging habits I have seen.

      The difficulty will lie in getting peers to remember that their tags should be relevant to content. Good luck with that- there are a number of folks who didn’t find the existing search function helpful and so did not tag posts with terms relevant to content.

      And of course, Jaqui- “mandatory tagging sucks”.

      Spell suggestion or spell check would be very helpful- especially to chronic mis-spellers who shall remain nameless…

      • #3283639

        Not just user tags

        by jfpsf ·

        In reply to I like it

        The search doesn’t rely only on tagging by TR users. We have built software that tags all the content by extracting all the statistically significant phrases from the text. That is what is providing the majority of the tags.

        • #3209266

          Then I missed something

          by tig2 ·

          In reply to Not just user tags

          And will go back and re-try. I searched on “cancer” because I was reasonably certain of discussions on this topic. It didn’t surprise me when those discussions were yielded, nor was I surprised to see white papers that used the term in their titles. I did expect to see a blog but also am aware that blogs are being phased out and you may not accommodate them.

          I wasn’t surprised when the “Join the Pink Ribbon Brigade” didn’t show as I didn’t tag it for cancer, but if the software should have discerned that content, then I am curious as to what I should have expected.

          I shall have to continue to play with this…

        • #3209205

          Limits of Software

          by jfpsf ·

          In reply to Then I missed something

          The software will only tag the content if the phrase appears often enough.
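
A rough sketch of what a frequency cutoff like the one described above could look like. It uses a plain occurrence count, which is a simplification: the thread does not say how the actual software measures statistical significance, and `extract_tags`, `min_count`, and `max_words` are hypothetical names.

```python
import re
from collections import Counter

def extract_tags(text, min_count=3, max_words=2):
    """Return phrases of up to max_words words that occur at least min_count times."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for size in range(1, max_words + 1):
        # Count every run of `size` consecutive words as a candidate phrase.
        for start in range(len(words) - size + 1):
            counts[" ".join(words[start:start + size])] += 1
    return {phrase for phrase, count in counts.items() if count >= min_count}

sample = "Pink ribbon campaign. Wear a ribbon. The ribbon matters."
print(extract_tags(sample))
```

In this sample "ribbon" appears three times and becomes a tag, while "pink ribbon" appears once and does not. A real tagger would also need stop-word handling so that common words are not tagged.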

        • #3209139

          That makes sense then

          by tig2 ·

          In reply to Limits of Software

          I believe that the actual phrase occurred once or perhaps twice in the originating post and comparatively few times thereafter.

          Now the results make sense. I appreciate you clarifying that for me!

        • #3209134

          Try Pink Ribbon Now

          by jfpsf ·

          In reply to That makes sense then

          The main problem appears to be that the software decided “ribbon” was a tag, but not “pink ribbon.” Generally, we try to strip out adjectives, and for the most part that is a good rule. Otherwise we don’t identify enough tags.

          We had a failover search feature for when there were no tag results that I had taken out for performance reasons. I am happier with the performance now, so I have put it back. So, try pink ribbon now.

        • #3208985

          Nice!

          by tig2 ·

          In reply to Try Pink Ribbon Now

          The response was quite good and found what I thought it should as well as some things I didn’t know existed.

          I think that the speed is fine but that’s just me. Nice work!

    • #3209252

      Thank goodness it’s finally launching

      by rexworld ·

      In reply to Please help us beta test TechRepublic’s new search

      I was working on this tag search like six months ago, before I left TR. I’m glad you guys finally are launching it. Though clearly somebody has improved, if not completely replaced, my crappy code because this one runs really quickly 🙂

      I’m heavily biased of course but I love this search. It’s not just incrementally better than the existing search, it’s orders of magnitude better I think.

      My only input is that having the number of results in the filter drop-down is a bit confusing, because it almost makes it seem as though when I type in a new search term it’s going to search only within those results in the drop-down. Instead it’s doing a whole new search across the entire selected category.

      • #3209245

        No full-text failover?

        by rexworld ·

        In reply to Thank goodness it’s finally launching

        Okay one other comment–I think the lack of a full-text failover is bad. Because one thing that folks use a lot is searching for a person’s name. For example if I want to search for threads from or about “Sonja”, I can’t do that with this new tag search page.

        You need to integrate full-text with tag-search, can’t have just tag-search.

        • #3209206

          full-text failover

          by jfpsf ·

          In reply to No full-text failover?

          Full-text backup was removed while I sped it up. It’s back now, Rex. And your code was pretty good.

    • #3209172

      OK I waited a bit till you got it faster

      by hal 9000 ·

      In reply to Please help us beta test TechRepublic’s new search

      Well for Discussions it works fine even if it did find that Horrible Green Festering Monster that is taking over your servers called The Evolution Lie.

      It seems quite OK for the discussions but seems to miss some in the White Papers, though to be fair the one that I was looking for is quite old and may no longer exist, as I found several that had been removed from your Servers.

      Overall it’s a great improvement on the old search engine, and I can see that with the further work that has been promised it will be a massive improvement to TR.

      Great Job Guys!

      Col

      • #3209132

        White Paper Examples

        by jfpsf ·

        In reply to OK I waited a bit till you got it faster

        Can you give me some examples of white papers you couldn’t find? The search index includes the entire white paper database, so maybe I can figure out the problem.

        • #3209113

          The main one that I couldn’t find was

          by hal 9000 ·

          In reply to White Paper Examples

          A great TR publication on Consulting Agreements. It was basically a template for Consultants to use to set up their Service Agreements with Companies. I used to point various companies to it as a way to look at how a Consultant should be approached but for some reason I lost the link.

          While there are other good Consultant Agreements listed, this TR one was definitely the best, but as I looked through the results of the search I noticed that several were no longer in the database, so maybe it’s been removed to make way for something newer.

          Col

        • #3209106
        • #3209069

          Sorry no that’s not it

          by hal 9000 ·

          In reply to Is this it?

          The one that I was looking for was a PDF only file.

          JD I’ll dig through my stored White Papers and try to find it again and post you a link when I manage to find it.

          Col

        • #3208990

          JP It’s not currently on this system

          by hal 9000 ·

          In reply to Sorry no that’s not it

          So I’ll have to dig through some of the generational Backups to see if I still have a copy. That might take a bit of time to do.

          However the one that I was looking for was more of a generalisation of what should be in a Consultant’s agreement, not a Fill In The Blanks type thing.

          Col

        • #3205729

          We are looking too

          by jfpsf ·

          In reply to JP It’s not currently on this system

          We have the TR editors looking for this now.

        • #3226971

          JP if this is of any help

          by hal 9000 ·

          In reply to JP It’s not currently on this system

          I haven’t found a copy of the thing but I haven’t had time to work through all the PDF files either yet.

          But I first saw this in the 2003-2004 time frame and think that it was a Service Agreement or Independent/Outside Consultants Agreement. Of course if they had names and not numbers it would be easier to find. 🙂

          Col

    • #3209156

      What I would appreciate, would be to have a search

      by deadly ernest ·

      In reply to Please help us beta test TechRepublic’s new search

      able to filter on Thread, Post Title, and Poster, with a wildcard if one of them is vacant. Also being able to set a time frame for the search would be nice, but not as important as the other.

      Often I will remember the thread title subject, or the post title subject, or part of one of them, usually I’ll remember who posted it.

      A perfect example is my current problem. I’m trying to find a post I made some months back where I costed a Vista-ready machine against an iMac at the prices of the time. I have used the search to view numerous threads but still can’t find this particular post in any of them.

      If I could run a search along the lines of:

      Thread Title = Vista or Cost

      Post Titles = Vista or Cost

      Poster = Deadly Ernest

      Then many of the threads and posts would not appear. It would not matter if the results page only listed the threads, as opening a thread page would list all the posts within it, and a scan of them should trigger my memory; in the end I would just open all my posts in that thread. But it would eliminate all the threads on Vista that I did NOT participate in.

      In the above search algorithm, a blank should always = Anything wildcard
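
A minimal sketch of the filter described above, where a blank field acts as a wildcard. The `Post` fields and the `filter_posts` signature are assumptions made for illustration; they do not reflect TechRepublic's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Post:
    thread_title: str
    post_title: str
    poster: str

def matches_any(text, terms):
    """An empty list of terms is a wildcard; otherwise any one term may match."""
    if not terms:
        return True
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)

def filter_posts(posts, thread_terms=(), post_terms=(), poster=""):
    """Keep posts that satisfy every non-blank criterion."""
    return [
        post for post in posts
        if matches_any(post.thread_title, thread_terms)
        and matches_any(post.post_title, post_terms)
        and (not poster or post.poster.lower() == poster.lower())
    ]

posts = [
    Post("Vista vs Mac cost", "My costing", "Deadly Ernest"),
    Post("Vista drivers", "Help", "someone else"),
    Post("Linux tips", "Cost of Vista box", "deadly ernest"),
]
print(filter_posts(posts, thread_terms=["Vista", "Cost"], poster="Deadly Ernest"))
```

Terms within one field are combined with OR, and the fields are combined with AND, which mirrors the "Vista or Cost" example.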

    • #3209081

      Search for emoticons

      by ontheropes ·

      In reply to Please help us beta test TechRepublic’s new search

    • #3209017

      This does not work in the slightest.

      by justin james ·

      In reply to Please help us beta test TechRepublic’s new search

      “AII” does not find the blog comment I wrote containing that phrase. “Justin James” only finds one item, the T1000 review I wrote. “Justin +James” does not bring up anything I wrote at all. “J.Ja” does not bring up any results, despite it appearing in everything I ever wrote for this site except for articles. “Justin AND James” does not return results. “”Justin James”” brings up results, but none for me. The results it brings up all simply have the word “James”. “Usability” does not bring up a single one of my blog posts or discussion items or blog comments, despite it being something that I am constantly writing about. Relevancy should more heavily weight the words in the query as the query words nearly always decrease in importance from right to left in the searcher’s mind; instead, all words in the query are weighted equally. As another commenter already pointed out, you also need to use a soundex algorithm and/or a spell checker to catch misspellings, typos, and mistakes.

      This search does not work right at all, where I sit. It actually pulls up fewer items than the current search system.

      Tag search is worthless, for the following reasons:

      * Tags are not standardized. I may use “software development” as a tag, the search may look for “programming.”

      * No automatic thesaurus for tagging, with a proper weighted graph of meaning. “Software development” would be just as important as “programming” if the user searches for “programming”, but a more obscure synonym for “software development” would be less important. Without the thesaurus, and standardized tags, the system lacks all use if it prioritizes tags over all else.

      * Limited tags. TechRepublic limits the length of the tag line, so I cannot even manually type in all of the applicable synonyms into the tags.

      * Emphasis on filtering full text search by the frequency of occurrence is bad. If I write an article about “Timothy Walters” and refer to him throughout the article as “Mr. Walters” with the exception of the initial paragraph saying “Timothy Walters,” your new search system will not find my article about “Timothy Walters” because that name only appears once, despite my whole article being about him.

      As such, the system MUST be using full text search, not as a failover, but it must be using full text search with tagging to add additional weight, provided the conditions I specify above are met (and that is just the tip of the iceberg!).

      “Folksonomy” as an idea is sexy, hip, cool, and has the aura of “democracy.” In reality, it is nearly useless in terms of helping users. There is plenty of research out there to support that statement.

      I heartily recommend the following article about vector-based search systems before attempting to write any search engine: http://www.perl.com/pub/a/2003/02/19/engine.html I also highly recommend a THOROUGH reading of Jakob Nielsen’s research into search habits (www.alertbox.com). He has put in countless hours getting detailed statistical models of how users actually use search. Search Engine Watch is another good resource.

      Attempting to shoehorn quality search within the confines of standard SQL is nearly impossible. I do not recommend it in the slightest. I can almost imagine how the search SQL is written, too, because I wrote something along these lines a few months ago, but at least the data was not actual content, and therefore tricky to search, it was mostly numerical and address searching. Trying to use SQL for something like this is hopeless. You probably want to be using a functional programming language such as Lisp, Scheme, O’Caml, etc. to implement a search, or one that has a lot of functional programming elements in it like Perl or Ruby. Trying to write search in an object oriented language like Java is like trying to move a pile of sand with a pair of tweezers, when there is a shovel two feet away that you are not allowed to touch.

      Also, the results returned are hard to read. Much better than giving the first few lines of an item is giving the matched word or phrase itself, surrounded by context, which lets the searcher more easily know if the content is really what they need. Sometimes we only need one sentence, or only remember one phrase, the phrase that we are looking for.
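
A small sketch of the keyword-in-context idea described above: show the matched phrase with some surrounding text rather than the first lines of the item. The `snippet` function and its `width` parameter are hypothetical, and the case-insensitive match assumes text whose length does not change when lowercased.

```python
def snippet(text, phrase, width=40):
    """Return the first occurrence of phrase with up to `width` characters of context on each side."""
    position = text.lower().find(phrase.lower())
    if position == -1:
        return None  # phrase not present in this item
    start = max(0, position - width)
    end = min(len(text), position + len(phrase) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix

body = "Long introduction. The article then discusses usability testing in detail."
print(snippet(body, "usability", width=15))
```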

      A number of years ago, I wrote a basic search system (for an ecommerce package I was sort of working on) that could at least parse out Boolean operators (both symbolic and using words). It is in Perl, but should be translatable fairly easily into another language. I would be happy to send it to you guys to use.

      J.Ja

      • #3205823

        addressing your points

        by peter spande ·

        In reply to This does not work in the slightest.

        You’re right to say that user tagging alone will not work. We’ve done two things to add to this –

        1. We have machine and editor tagging added to the search tables
        2. We have full text search as a back up.

        We’re also including related tags in this so that someone searching for “Windows” can standardize their search on “Microsoft Windows.”

        As for the author elements, this is one of the reasons we are not switching over to this search yet but asking members to take it for a test drive. Stay tuned and thanks for pushing us to make this better.

      • #3284827

        some more points

        by tp_cnet ·

        In reply to This does not work in the slightest.

        J.Ja, thanks for your detailed reply.

        To reiterate and elaborate a bit upon what JP_CNET and Peter have noted, we’ve built a hybrid search system that combines tags and a full-text search engine in an effort to leverage the best of both worlds.

        The tags used by the system are a combination of member-created tags, editor-created tags and software-generated tags. These tags not only help with keyword-oriented searches, but also let us show you similar or closely related tags to help you guide your search. The current notion of related-ness is based on content alone, but in future we hope to use additional methods to expand and refine the list of related tags.

        We will consider the use of soundex and/or spelling correction algorithms to handle typos. We also plan to index author names so you’ll be able to search by name.

        As a point of fact, our content analysis engine, which does the software tagging and is based on a vector space model, does attempt to map “Walters” to “Timothy Walters” etc, when it discerns (based on specific text patterns) that the latter is likely the name of a person. This is not very foolproof however since we need to err on the side of caution to avoid false positives.

        We do read Search Engine Watch when we have time 🙂

        • #3199010

          Excellent!

          by justin james ·

          In reply to some more points

          I am really happy that you guys are revamping/improving the search, and taking these suggestions seriously. It is really frustrating to know that something actually exists on the site and to not be able to find it. I look forward to seeing the finished product!

          J.Ja

        • #3198894

          Boolean Full Text Searches

          by jfpsf ·

          In reply to Excellent!

          Justin,

          I just rolled out boolean full text search to test. You can now enter “ibm service” to search for exact phrase, or +ibm -microsoft to search for documents that contain ibm, but not microsoft, etc.

          This bypasses the tag search and goes right to full-text.

          The full-text index is not ideal yet, but it is heading in the right direction.
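
A simplified sketch of how the query syntax described above (quoted exact phrases, `+term`, `-term`) might be parsed and applied. It rests on several assumptions: straight double quotes only, unprefixed words treated as required, and plain substring matching for phrases. The real implementation is not shown in this thread.

```python
import re

def parse_query(query):
    """Split a query into exact phrases, required terms, and excluded terms."""
    phrases = [p.lower() for p in re.findall(r'"([^"]+)"', query)]
    remainder = re.sub(r'"[^"]*"', " ", query)  # drop the quoted parts
    required, excluded = [], []
    for token in remainder.split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            term = token.lstrip("+").lower()
            if term:  # ignore a bare "+" with nothing after it
                required.append(term)
    return phrases, required, excluded

def matches(document, query):
    """True if the document satisfies every phrase, required, and excluded condition."""
    text = document.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    phrases, required, excluded = parse_query(query)
    return (
        all(phrase in text for phrase in phrases)
        and all(term in words for term in required)
        and not any(term in words for term in excluded)
    )

print(matches("IBM mainframes", "+ibm -microsoft"))
```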

        • #3200261

          Hmm.

          by justin james ·

          In reply to Boolean Full Text Searches

          It definitely seems to work, to an extent. It still has a few problems:

          1. English versions of the Boolean do not work (for example, “Justin AND James”).

          2. It does not consider the username as part of the text, and since it is not a tag either, a search for “Justin James” will only find items where that text is in the content itself or the tags, but not anything that I have actually written.

          3. Quoted queries do not function, for example, “”Justin James”” does not work.

          4. It only seems to be doing full text search on articles, not blogs or discussions.

          Hope this helps, and thanks for working so hard to deliver better search!

          J.Ja

        • #3228179

          Response to your points

          by jfpsf ·

          In reply to Hmm.

          Sorry to take so long to respond, but I have been out of the office.

          1. No, that will do a dual tag search. Only the Boolean operators work.

          2. We are adding user name as a tag to content. Just taking a little time.

          3 and 4. It works, but not all of the content is there to be searched yet.

    • #3201477

      New Features

      by jfpsf ·

      In reply to Please help us beta test TechRepublic’s new search

      We have added several new features to the search.

      1. Correction of misspellings and indistinct search terms

      2. Boolean full text support. For example, “Network Administrator” to search for that exact phrase. +IBM -DB2 to search for all documents containing IBM, but not containing DB2.

      Please, try them out and give us your thoughts.

      • #3200260

        One more bug…

        by justin james ·

        In reply to New Features

        It does not work properly at all when a period is part of a search token. For example, “J.Ja” returns a zillion articles about Java.

        J.Ja
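
One possible cause of the behaviour reported above (an assumption; the thread does not say) is a tokenizer that splits on periods, so “J.Ja” becomes “J” and “Ja”, and “Ja” then matches “Java” by prefix. A minimal sketch of a tokenizer that keeps periods inside a token, with hypothetical names throughout:

```python
import re

# Alphanumeric runs, optionally joined by single internal periods (e.g. "j.ja").
TOKEN = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")

def tokenize(text):
    """Lowercased tokens; periods inside a token are kept, trailing ones are dropped."""
    return [token.lower() for token in TOKEN.findall(text)]

def naive_tokenize(text):
    """For comparison: splitting on every non-alphanumeric character."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

print(tokenize("J.Ja wrote about Java."))
print(naive_tokenize("J.Ja"))
```

The same tokenizer would have to be applied both when indexing and when parsing queries for the tokens to line up.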
