===== Paper summary =====

This paper proposes using clickstream analysis to detect Sybils in online social networks (OSNs). Using real-world datasets from Renren, the authors demonstrate that unsupervised clustering techniques can detect Sybils with low false positive rates (~1%) and low false negative rates (~3-10%).

===== Paper strengths =====

+ The paper considers a number of practical issues related to Sybil detection, and shows that the proposed technique is highly scalable and appropriate for large online social networking sites with hundreds of millions of users.

+ The technique has been applied by security teams at Renren and LinkedIn, with apparently promising results.

+ The evaluation is thorough and mostly convincing (limitations are discussed below).

===== Paper weaknesses =====

- It is unclear that the approach achieves a sufficiently low false positive rate to be useful. A ~1% FPR for a 220M-user site (Renren) translates to approximately 2.2M false positives, far more than can be resolved through any reasonable manual process (a quick back-of-the-envelope check appears in the comments below). A brief discussion of how this many false positives could be handled is warranted (e.g., through some secondary verification process).

- As the authors note, clickstream analysis has been applied in several similar settings to detect misbehavior. The paper's techniques are shown to be effective (modulo the above comments about the FPR), but they are not especially novel.

===== Comments for author =====

Overall, the paper makes a convincing argument that clickstream analysis is a useful and relatively accurate method of performing anomaly detection on OSNs.

As an editorial comment, I suggest the authors include a roadmap in the introduction. While I am not normally a fan of these mostly superfluous paragraphs, it is hard to read the paper without noticing some obvious problems (e.g., where would one obtain ground truth? How would this approach scale to hundreds of millions of users? How does the detection mechanism react to changing trends?), only to have these issues addressed in the next section. The paper becomes more and more convincing as you read it, but it would be helpful to the reader to state upfront that each section works toward a more comprehensive and practical solution.

The paper could also be improved by discussing the limitations of the dataset. For example, I wonder about the patterns of non-Sybil new users versus non-Sybil users who are already well established on the OSN: I suspect that the activities of a new Renren user will look vastly different from those of a seasoned user. Likewise, organizational users (e.g., business accounts) likely exhibit very different usage patterns. It would be interesting to dive a bit deeper into what types of users are covered in your dataset and, if the dataset lacks diversity, to discuss how that might be addressed.

Similarly, it would be interesting to learn more about the false positives. Did manual inspection reveal anything particular about these accounts?

The dataset contains users who were banned during the course of data collection. It would strengthen the paper to add a discussion of how this might affect your results.

It would be helpful to describe in more detail what type(s) of misbehavior you are trying to detect. Sybils might be useful for spamming or for bolstering a targeted online profile. The paper implies that the proposed solution is intended to identify spammers, while the technique could be useful for detecting any accounts that deviate from some norm(s).
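To make the false-positive concern raised under weaknesses concrete, here is a back-of-the-envelope check (a sketch only: it assumes the detector is run over all ~220M Renren accounts, using the paper's reported ~1% FPR):

    # Rough estimate of the absolute number of false positives implied
    # by a ~1% FPR, assuming all ~220M Renren accounts are scanned.
    users = 220_000_000  # Renren user count cited in the paper
    fpr = 0.01           # paper's reported false positive rate
    print(f"{users * fpr:,.0f} false positives")  # -> 2,200,000

Even under optimistic assumptions about review throughput, on the order of two million flagged accounts would overwhelm any manual verification pipeline.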
The "Accuracy over Time" subsection is unconvincing, given the very limited timespan that was investigated. On page 11, the paper states that 4% of normal users are necessary to color 99% of normal clusters. 4% of 220M is 9M. I'm highly skeptical that Renren can easily identify 9M normal ("representative") users. The related work section is weak and seems rushed. In particular, why is it not straightforward to adapt the techniques from [13,27] and apply them to OSNs?