[nycphp-talk] Plagiarism Checker in PHP
Mutaz Musa
mutazmusa at gmail.com
Sun May 2 14:40:06 EDT 2010
Actually, turnitin.com, a website where students turn in schoolwork, checks
for plagiarism and, from what I understand, does a pretty good job. They
compare both student-to-student and student-to-literature. I'm not sure how
they pull it off; it's probably a proprietary algorithm that tries to match
student submissions against their database of content, where no match = no
plagiarism. It seems like a mammoth task, though. Turnitin.com actually
crawls academic journals, books, etc. to populate their database.
If you're only interested in comparing submissions to your site against one
another, that may be more manageable. If, on the other hand, you want to
compare against online/print content in general, then that will prove a real
challenge. Finally, if you have the money, maybe you can register with
turnitin.com - although my impression is that they're primarily for
universities/colleges.
http://turnitin.com/static/index.html
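
For the more manageable case (comparing submissions on your own site against
one another), here is a minimal sketch of one common approach: break each
article into overlapping word sequences ("shingles") and measure how many two
articles share (Jaccard similarity). This is not how turnitin.com works, as
far as I know; the function names shingles() and similarity(), the shingle
size of 3, and the ASCII-only word splitting are all my own assumptions.

```php
<?php
// Hypothetical helper: turn text into a set of overlapping n-word "shingles".
// Assumes mostly ASCII text; other characters are treated as separators.
function shingles(string $text, int $n = 3): array
{
    // Lowercase and split on anything that is not a letter or digit,
    // so punctuation and case differences do not affect the comparison.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $set = [];
    for ($i = 0; $i + $n <= count($words); $i++) {
        // Use the shingle as an array key so duplicates collapse automatically.
        $set[implode(' ', array_slice($words, $i, $n))] = true;
    }
    return $set;
}

// Jaccard similarity of the two shingle sets: |A intersect B| / |A union B|.
// Returns a value between 0.0 (nothing shared) and 1.0 (identical sets).
function similarity(string $a, string $b, int $n = 3): float
{
    $sa = shingles($a, $n);
    $sb = shingles($b, $n);

    $union = count($sa + $sb); // array union on keys
    if ($union === 0) {
        return 0.0; // both texts shorter than $n words
    }
    return count(array_intersect_key($sa, $sb)) / $union;
}
```

You would call similarity($newArticle, $existingArticle) for each stored
article and flag anything above a cutoff you pick by experiment. Keep in mind
this only catches near-verbatim copying; a reworded article will score low.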
Best,
Mutaz
On Sun, May 2, 2010 at 1:58 PM, Hans Zaunere <lists at zaunere.com> wrote:
> > Hi Friends,
>
> Hi,
>
> > I am developing blogger website using PHP & MySQL. And, I would like
> > add the feature of Plagiarism Checker.
> >
> > If any one posted an article in my website, that article must be unique
> > and original.
> >
> > Could you please tell / suggest me how to develop Plagiarism Checker
> > feature or send some useful articles / free APIs and so on.
>
> Hmm, I think there is a recursion problem here - if we tell you, then you'd
> be plagiarizing our solution :)
>
> > Expecting your positive reply ......
>
> Seriously though, unless I'm missing something, I can't see how this would
> be possible. I suppose you could use techniques such as comparing the
> number of similar words between articles, but that's not really exact, and
> likely to have incorrect results. Plus, you're looking to do this
> plagiarism check across the whole Internet?
>
> Even humans, who read different articles, have a hard time determining if
> something is actually plagiarized or not - I think doing it automatically
> is nearly impossible.
>
> H