U.S. patent application number 14/225405 was filed with the patent office on 2014-03-25 and published on 2014-07-24 for recruiting service graphical user interface.
This patent application is currently assigned to GILD, Inc. The applicant listed for this patent is GILD, Inc. Invention is credited to Luca BONMASSAR, John Dane Smilanick.
Application Number | 14/225405 |
Publication Number | 20140207699 |
Family ID | 49478198 |
Filed Date | 2014-03-25 |
United States Patent Application | 20140207699 |
Kind Code | A1 |
BONMASSAR; Luca; et al. | July 24, 2014 |
RECRUITING SERVICE GRAPHICAL USER INTERFACE
Abstract
A recruiting service is disclosed that generates profiles of
software developers having specific skills. Public code
repositories are examined to identify projects of software
developers. The projects are analyzed to estimate the number of
years of experience a software developer has with an individual
language and determine a score with respect to other developers.
Social media information and a messaging link may also be provided
with each profile. A graphical user interface for displaying the
information is disclosed.
Inventors: | BONMASSAR; Luca (Marina di Massa, IT); Smilanick; John Dane (Santa Clara, CA) |
Applicant: |
Name | City | State | Country | Type |
GILD, Inc. | San Francisco | CA | US | |
Assignee: | GILD, Inc., San Francisco, CA |
Family ID: | 49478198 |
Appl. No.: | 14/225405 |
Filed: | March 25, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13493791 | Jun 11, 2012 | 8719179 |
14225405 | | |
61640656 | Apr 30, 2012 | |
Current U.S. Class: | 705/321 |
Current CPC Class: | G06F 16/334 20190101; G06F 3/04842 20130101; G06Q 10/1053 20130101 |
Class at Publication: | 705/321 |
International Class: | G06Q 10/10 20060101 |
Claims
1. A computer implemented method of providing information for
recruiting software developers executed on a computer including a
processor, a memory and a network interface, comprising: receiving,
in the processor, via the network interface, commit logs associated
with software projects wherein each commit log identifies software
developers which have contributed to the software projects
including what lines of software code each of the software
developers contributed and when the lines of the software code were
contributed; selecting, in the processor, a first software
developer from among the software developers; locating, in the
processor, from within the commit logs a plurality of instances
where the first software developer has contributed the software
code to one or more of the software projects; determining, in the
processor, how many of the lines of the software code the first
software developer has contributed in each of the plurality of
instances; determining, in the processor, when each of the
plurality of instances were contributed; determining, in the
processor, in what programming language each of the plurality of
instances was contributed; and based upon how many lines of the software
code were contributed in each of the plurality of instances and
when each of the plurality of instances were contributed,
estimating a number of years of experience of the software
developer in at least one programming language.
2. The method of claim 1, further comprising creating a profile
associated with the first developer and storing to the profile the
number of years of experience in the at least one programming
language and identification information associated with the first
developer to the profile.
3. The method of claim 2, further comprising, based upon the
identification information, retrieving, from a social media site,
additional information about the first developer and storing the
additional information to the profile.
4. The method of claim 2, wherein the additional information
includes one or more selectable links, which, when selected, cause
information about one or more instances to be displayed.
5. The method of claim 4, wherein the information about the one or
more instances includes the lines of the software code associated
with the one or more instances.
6. The method of claim 1, further comprising retrieving first
software code associated with one or more of the instances,
analyzing the first software code, wherein the number of years of
experience of the software developer in the at least one
programming language is based upon the analyzing of the first
software code.
7. The method of claim 6, further comprising analyzing a quality of
the first software code.
8. The method of claim 7, further comprising determining a skill
level of the first software developer based on the quality of the
first software code.
9. The method of claim 1, further comprising determining how many
times first software code associated with one or more of the
instances has been viewed by other software developers.
10. The method of claim 1, further comprising determining how many
times first software code associated with one or more of the
instances has been copied and incorporated into one or more
additional software projects different from a first software
project in which the first software code was used.
11. The method of claim 1, further comprising estimating the number
of years of experience of the first software developer in a
plurality of programming languages.
12. The method of claim 11, further comprising estimating an
overall number of years of experience of the first software
developer based upon the number of years of experience estimated in
each of the plurality of programming languages.
13. The method of claim 1, further comprising determining a
frequency at which the plurality of instances were contributed
wherein the number of years of experience of the software developer
in at least one programming language is estimated based upon the
frequency.
14. The method of claim 13 wherein more or less number of years of
experience is attributed to the first software developer based upon
the frequency.
15. The method of claim 1, further comprising determining that a large
number of lines of the software code were contributed in a short
time period and reducing a contribution of the large number of
lines of the software code to the number of years of experience
which are estimated.
16. The method of claim 1, further comprising estimating the number
of years of experience of the software developer in at least one
programming language at a first time, receiving updated commit
logs, locating, in the processor, from within the updated commit
logs one or more of new instances where the first software
developer has contributed the software code and estimating, at a
second time, the number of years of experience based upon the
plurality of instances and the one or more new instances.
17. The method of claim 1, further comprising estimating the number
of years of experience for each of a plurality of the software
developers.
18. The method of claim 1, further comprising determining a rate at
which the lines of the software code were generated wherein the
number of years of experience is based upon the rate.
19. The method of claim 1, wherein the commit logs are retrieved
from one or more public code repositories.
20. The method of claim 1, wherein new commits associated with the
commit logs are retrieved on a periodic basis and the number of
years of experience is updated on the periodic basis.
21. The method of claim 1, further comprising determining a number
of collaborators on a first software project associated with a
first instance wherein the number of years of experience which is
estimated is based upon the number of collaborators.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority under 35 U.S.C. § 120
and is a Continuation of co-pending U.S. patent application Ser.
No. 13/493,791, filed Jun. 11, 2012, by Bonmassar et al., which
claims priority under 35 U.S.C. § 119(e) to U.S. Provisional
Application No. 61/640,656, filed Apr. 30, 2012, entitled
RECRUITING SERVICE GRAPHICAL USER INTERFACE, the contents of each
of which are hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention is generally related to employment
recruitment tools. More particularly, the present invention is
directed to a user interface, search technology, and scoring
technique to automatically provide information to aid in recruiting
software developers.
BACKGROUND OF THE INVENTION
[0003] Recruiting skilled software developers is a difficult task.
How does one find qualified candidates? Many of the conventional
recruiting approaches based on reviewing resumes do not work well
for recruiting software developers.
[0004] One of the problems in the prior art is identifying
individuals who have expertise in specific software languages as
well as the passion and ingenuity to solve specific problems. How
does a recruiter evaluate the actual skills and talents of
prospective candidates? Education alone is not adequate to
determine actual talent. Nor is the number of years working in
industry a good measure of talent.
[0005] Another problem in the prior art is identifying whether a
candidate for a software development position will be a good social
fit for a company. Conventional resumes do not provide a good
indicator of the social fit of a candidate.
[0006] One aspect of these problems in the prior art is that it is
difficult to perform a pre-screening to identify talented
candidates to fill a software development position. As a result,
many companies waste enormous amounts of time trying to find
qualified candidates to fill software development positions.
Additionally, the difficulties in assessing the actual talent of a
candidate means that companies sometimes end up with employees that
cannot perform as expected.
SUMMARY OF THE INVENTION
[0007] A recruiting service generates a graphical user interface in
response to a query. The graphical user interface provides profile
information for software developers. The profile information
includes a ranking based on analysis of public code repositories
and may also include other information regarding the knowledge,
experience, and influence of a developer. The profile information
may also be augmented with additional social media information,
such as social media links for a developer. A messaging link may
also optionally be provided to contact a developer. The recruiting
service thus permits a user to input a query to find software
developers with specific skills and receive a graphical user
interface providing objective evaluation information based on the
code written by the developer.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] FIG. 1 is a high level diagram illustrating a recruiting
service for software developers in accordance with an embodiment of
the present invention.
[0009] FIG. 2 is a screenshot illustrating a graphical user
interface displaying an initial listing of profiles matching a
query in accordance with an embodiment of the present
invention.
[0010] FIG. 3 is a screenshot illustrating the graphical user
interface displaying a first portion of an individual profile in
accordance with an embodiment of the present invention.
[0011] FIG. 4 is a screenshot illustrating the graphical user
interface displaying a second portion of an individual profile in
accordance with an embodiment of the present invention.
[0012] FIG. 5 is a block diagram of a recruiting service in
accordance with an embodiment of the present invention.
DETAILED DESCRIPTION
[0013] FIG. 1 is a high level system diagram of a recruiting
service 100 in accordance with an embodiment of the present
invention. The recruiting service 100 is a computer-implemented
service that may include one or more servers and associated
hardware, such as computer processors, a database, and a memory for
storing computer program instructions. The recruiting service 100
accesses information sources on the Internet to obtain information
on software developers to develop profile information that includes
information about the skills and experience of software
developers.
[0014] In some situations a large organization could maintain the
recruiting service as an in-house tool available to users within
the organization via a local area network or Intranet. However,
more generally the recruiting service may be implemented as a
web-hosted service available over the Internet to individuals,
companies, or organizations seeking to obtain information on
potential candidates for software development positions.
[0015] An exemplary set of Internet information sources is
illustrated in FIG. 1. One aspect of the present invention is that
code repositories 105, such as public code repositories, are
searched. Public code repositories are repositories in which
programmers can store a software project that they have worked on
in an individual repository for others to view and comment on.
Examples of repository hosts include GitHub, Inc. of San Francisco,
Calif., which permits a programmer to push source code to a
repository so that it is accessible and transparent to others. In
GitHub, each project in a repository includes a file history
listing each commit that changed the file along with the author (or
authors) for each commit. Other examples include sites operated by
companies and organizations such as Bitbucket, Google Code of
Google, Inc. of Mountain View, Calif., SourceForge, Launchpad, and
Type, the Apache foundation and the Mozilla foundation.
[0016] At least one other source of information is preferably
accessed to obtain additional information for each profile. Other
potential sources of information the recruiting service can access
are forum and discussion groups 110 used by programmers, such as
Stack Overflow (operated by Stack Exchange, Inc. of New York,
N.Y.), and news groups like Hacker News (a social news website
about hacking and startup companies) or Android developer mailing
lists. For example, forum and discussion groups may be used to
provide a source of information on the reputation and influence of
individual developers. Another option is for the
recruiting service to access general or social media sites 115,
such as those provided by companies such as Facebook, Inc. and
LinkedIn, Inc. Further sources of information are contact services
and social intelligence services 120, which provide resources to
identify individuals from partial contact information and otherwise
expand an initial set of contact information into a wider set of
contact information and social information from which links to
social media can be determined. More generally, other public
information sources 125, such as professional network sites, may
also be searched where they are relevant to determining the
influence, skills, or biographical information of software
developers.
[0017] A user utilizes a computer 102 in communication with the
recruiting service 100 via the Internet. The user's computer 102
displays a graphical user interface generated by the recruiting
service 100. A user searching for candidates to fill a software
development position accesses the recruiting service 100 to input a
query 130 defining an initial candidate specification, such as
proficiency in one or more programming languages. Other examples of
a candidate specification include a geographical area
specification. In response, the graphical user interface generated
by the recruiting service provides a listing of profiles of
potential candidates as illustrated by arrow 140, which may also be
presented in a ranked order. The user can then request more
detailed profile information for individual candidates. An
exemplary set of profile information includes the number of years
and relative ranking of the candidate in different programming
languages, an influence score, overall experience level, a summary
of programming projects and links to the projects, a summary of
employment history, and social media information. A messaging link
is preferably provided to permit the candidate to be contacted
either directly via email (e.g., via either anonymous or
non-anonymous email) or by other contact modalities (e.g.,
messaging, phone, etc.).
[0018] FIG. 2 is a screenshot of an exemplary graphical user
interface. A search field 205 permits a user to enter queries
based on skills. For example, in one embodiment a user may input
language skills and any Boolean logic operators (e.g., AND or OR)
to define a skill portion of the query. A location search field 210
permits the query to be limited by geographical area and a name
field 215 permits the query to be limited by name of the developer.
Additionally, it is contemplated that other search fields could be
included, if desired, to focus a search.
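As an illustration only, the query fields described above could be combined as in the following minimal Python sketch. The `Profile` record, the `matches_skill_query` and `search` helpers, and the rule that AND binds tighter than OR (with no parentheses) are all assumptions made for this example; the specification does not define a query grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    # Hypothetical profile record; the field names are illustrative only.
    name: str
    location: str
    skills: set = field(default_factory=set)


def matches_skill_query(query: str, skills: set) -> bool:
    """Evaluate a flat Boolean skill query such as 'Java AND Scala OR Python'.

    Assumption: AND binds tighter than OR and parentheses are not supported.
    """
    known = {s.lower() for s in skills}
    for or_group in query.split(" OR "):
        terms = [t.strip().lower() for t in or_group.split(" AND ")]
        terms = [t for t in terms if t]
        if terms and all(term in known for term in terms):
            return True
    return False


def search(profiles, skill_query="", location="", name=""):
    """Filter profiles by the skill, location, and name fields (205, 210, 215)."""
    results = []
    for profile in profiles:
        if skill_query and not matches_skill_query(skill_query, profile.skills):
            continue
        if location and location.lower() not in profile.location.lower():
            continue
        if name and name.lower() not in profile.name.lower():
            continue
        results.append(profile)
    return results
```

In this sketch an empty field simply places no restriction on the search, which mirrors the optional nature of the location and name fields.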
[0019] In this example, a skill query based on "Java" skills is
input into search field 205. A search button 207 permits the search
to be triggered. This results in an initial listing 220 of profiles
225. In one implementation the profiles are sorted and ranked by
overall knowledge. Other profile information may be displayed in
the initial listing such as the developer's name 227, photo 229 (if
available), brief summary of employment history 231 (if available),
and ranked scoring 233 in different programming languages including
those in the query and other selected languages for the profile.
Thus in this example, the ranked scoring includes the Java language
ranking first (because the query was for Java) along with other top
scores. The user interface may also provide an indication of the
ranking in terms of the top rankings (e.g., through a set of top
rankings, such as top 10%, 20%, or 30%) via a tab or other visual
indicator. Thus, the user can quickly search for profiles in the
initial listing corresponding to developers that are knowledgeable
and skilled in a language of interest.
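The top-ranking indicator described above can be sketched as a simple percentile lookup. The function name `top_band`, the 1-based rank convention, and the default band edges (taken from the 10%, 20%, 30% example) are assumptions for illustration, not a description of the actual interface logic.

```python
def top_band(rank: int, total: int, bands=(10, 20, 30)):
    """Return a label such as 'top 10%' for a 1-based rank among `total` developers.

    Returns None when the developer falls outside every band.
    """
    if total <= 0 or rank < 1:
        raise ValueError("rank must be >= 1 and total must be positive")
    percentile = 100.0 * rank / total
    for band in sorted(bands):
        if percentile <= band:
            return f"top {band}%"
    return None
```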
[0020] The graphical user interface permits a user to select an
individual profile and then displays detailed profile information
for the individual profile. FIG. 3 illustrates a first portion of
an individual profile for a developer. The profile may include the
person's name 227, photo 229 (if available), a brief summary of
code analysis 305, full results of code analysis 310, knowledge
ranking 315 (e.g., a number from 0 to 100), overall experience
level 320 (estimated number of years of experience), influence
score 325 (e.g., a number from zero to five), and a messaging link
330.
[0021] The code analysis 310 is based on analyzing code from code
repositories to provide objective information regarding a minimum
number of years of experience in a particular programming language
as well as an objective analysis of the code itself to provide a
ranking of the developer's skills. The full analysis includes the
estimated number of years of experience with each language. Tabs
are provided indicating top rankings (e.g., through a set useful to
the end-user, such as top 10%, top 20%, top 30%, etc.). The number
of views by others and the adoption of code by others may be used
to generate the influence score 325 as a measure of how influential
the programmer is.
[0022] FIG. 4 illustrates a second portion of the profile for the
developer. The employment history 410 of the developer is
summarized when it is available. For example, such employment
information is sometimes (but not always) posted on public websites
such as LinkedIn. Social profile information 420 is provided, which
may include links 425 to social media websites that the developer
uses.
[0023] A bio summary 430 may be extracted from social media.
Alternatively, in one embodiment, a software developer is permitted
to check their profile and take ownership of their profile in the
sense of providing some limited voluntary inputs, such as bio
summary information, and also provide feedback on any errors.
[0024] A summary of projects 440 accessible in code repositories is
also provided. The summary preferably also includes links to the
code in the repository for each project 445 for users interested in
performing a more detailed analysis of the code itself.
Additionally, information about the project may be included such as
file size in terms of number of lines of code, number of views by
other developers, and number of collaborators.
[0025] FIG. 5 illustrates in more detail a functional block diagram
of a recruiting service 500 in accordance with an embodiment of the
present invention. The recruiting service 500 may reside on one or
more servers with associated processors and memory, wherein the
computer code is stored on a computer readable memory. A database
memory may be provided to store information for the recruiting
service, including candidate profile information.
[0026] A crawler 505 is provided to crawl code repositories. For
example, the crawler may use an API for code hosting sites such as
GitHub. A new candidate profile generation module 510 determines
whether the crawler has identified a new developer. If so, a
profile ID is generated to build a new profile. A code file type
analysis module 515 determines the file type of files being
crawled. After the file type has been determined, the
language-specific code analysis module is selected by module 520.
Scoring and cheating detection is then performed by module 525.
Profiles are stored in a profile information database 530. A social
media access module 535 provides access to social media information
sites and a social media aggregation module 540 correlates
aggregated social media information for individual profiles. A
messaging interface 545 is included in one embodiment as a means
for recruiters to contact individual developers. However it will be
understood the messaging interface 545 may be omitted in some
implementations. The messaging may, for example, be brokered in the
sense of cloaking the user information and email address of the
recruiter during initial attempts to contact a developer. A
recruiting search engine and graphical user interface module 550 is
responsible for generating the graphical user interface that is
provided for display on a user's computer.
[0027] The new candidate profile generation module 510 utilizes
author information from crawled sites to detect that there is a new
developer to be added to the system. Code repository sites include
author information for each project. This author information is
searched by the crawler. Each individual person with a profile has
a unique ID. The unique ID is created the first time an individual
programmer's name is discovered in crawling author information in
code hosting sites. For example, when the crawler finds the names
of people that have contributed code to a code hosting site, the
system compares the unique ID from the network that the person is
found on to the unique IDs in the database of the recruiting
service for that network. If an ID doesn't exist, a new user ID is
created.
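The lookup-or-create behavior described above can be sketched as follows. The `ProfileRegistry` class, its in-memory dictionary, and the sequential integer IDs are hypothetical stand-ins for the profile database; the specification does not describe the storage layer.

```python
import itertools


class ProfileRegistry:
    """Maps (network, network-specific author ID) pairs to internal profile IDs."""

    def __init__(self):
        self._ids = {}                      # (network, network_user_id) -> profile ID
        self._counter = itertools.count(1)  # hypothetical sequential ID source

    def get_or_create(self, network: str, network_user_id: str):
        """Return (profile_id, created) for an author found while crawling."""
        key = (network, network_user_id)
        if key in self._ids:
            return self._ids[key], False
        profile_id = next(self._counter)
        self._ids[key] = profile_id
        return profile_id, True
```

Keying on the network as well as the author ID reflects the statement that IDs are compared per network; linking the same person across networks would be a separate step.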
[0028] The crawling process generates project information for each
developer. One way to obtain project information for a particular
person is to specifically ask a code hosting site (or content site
like Stack Overflow) for a list of projects for each developer. For
example, this may be done through an API for sites such as GitHub
and Stack Overflow.
[0029] The crawling is updated regularly and the profiles are
refreshed according to a cycle. An exemplary refresh cycle is a
two-week profile refresh cycle. That is to say, the update from the
crawlers may be constant, but the profiles may be updated according
to a schedule, such as every two weeks.
[0030] One aspect of the crawling process is that the source code
for a particular project is downloaded for analysis. As
illustrative examples, the source may be downloaded using
technologies such as Git, SVN, Mercurial, and CVS, which are
technologies that allow a code repository to be synchronized with
the local computer.
[0031] It is preferable to download all of the available
information in a repository for analysis. However, note that the
source code for a project may be in any one of a variety of
different file types.
[0032] Downloaded files are then processed, starting first with the
code file type analysis module 515. An individual file is analyzed
to determine its contents by looking at the file extension and the
binary data or text that the file contains. Specific patterns in
the source code are analyzed. For example, scripts in some
languages, such as Ruby, commonly begin with a characteristic first
line (e.g., the hashbang), so looking for a specific set of
keywords in the file permits the language to be identified.
Additionally, the analysis can include looking for the
"magic number", a set of bytes at the beginning of the file that
indicates file type. For example, common image formats start with a
specific byte signature. The pattern associated with each
different file type is checked until a match is found. The pattern
matching may be performed, for example, using a sequence of if-then
clauses to identify the file type of a particular file.
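The sequence of checks described above might look like the following minimal sketch. The function `detect_file_type`, the three lookup tables, and the order of the checks (magic number, then hashbang, then extension) are assumptions for illustration; only the PNG, JPEG, and GIF signatures are standard facts, and a real implementation would cover many more types.

```python
import os

# Well-known leading byte signatures ("magic numbers") for a few image formats.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "png-image"),
    (b"\xff\xd8\xff", "jpeg-image"),
    (b"GIF8", "gif-image"),
]

# Illustrative, deliberately incomplete tables.
HASHBANG_PREFIXES = [("ruby", "ruby"), ("python", "python"), ("perl", "perl"),
                     ("bash", "shell"), ("sh", "shell")]
EXTENSIONS = {".rb": "ruby", ".py": "python", ".java": "java",
              ".pl": "perl", ".sh": "shell"}


def _interpreter_from_hashbang(first_line: str) -> str:
    """Extract the interpreter name from a line such as '#!/usr/bin/env ruby'."""
    parts = first_line[2:].split()
    if not parts:
        return ""
    program = os.path.basename(parts[0])
    if program == "env" and len(parts) > 1:
        program = parts[1]
    return program


def detect_file_type(filename: str, data: bytes) -> str:
    # 1. Binary signatures at the start of the file.
    for magic, file_type in MAGIC_NUMBERS:
        if data.startswith(magic):
            return file_type
    # 2. Hashbang on the first line of a script.
    if data.startswith(b"#!"):
        first_line = data.split(b"\n", 1)[0].decode("utf-8", errors="replace")
        program = _interpreter_from_hashbang(first_line)
        for prefix, language in HASHBANG_PREFIXES:
            if program.startswith(prefix):
                return language
    # 3. Fall back to the file extension.
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSIONS.get(extension, "unknown")
```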
[0033] After the file type is determined, the language specific
code analysis selection module 520 makes a selection of an
evaluation tool or tools appropriate for the language of the file
type. Most software languages have evaluation tools to evaluate the
quality and complexity of the coding. The evaluation tools are
specific to a particular language and may, for example, look at the
length of the code and patterns in the code. For many cases, the
evaluation tools for a specific language are open source and/or
available from commercial vendors. For example, there is a unique
set of tools to evaluate Ruby--tools that differ from those used to
evaluate C++. Thus if the recruiting service is designed to analyze
code in languages such as Java, Scala, Shell, ActionScript, XML,
CSS, HTML, Groovy, PHP, Perl, Python, Lisp, etc. then the system
includes the corresponding evaluation tool for each supported
language. Thus, the recruiting service includes a wide range of
evaluation tools to support different languages and makes the
selection of the proper evaluation tool based on the file type. The
file is then analyzed using the appropriate selected tool(s) for
the language associated with the file type.
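Selecting evaluation tools by language can be sketched as a dispatch table. The two metric functions below are toy placeholders invented for this example, not real evaluation tools; in practice each entry would wrap an existing open source or commercial analyzer for that language.

```python
def line_count_metric(source: str) -> dict:
    """Placeholder tool: count non-blank lines."""
    return {"lines": sum(1 for line in source.splitlines() if line.strip())}


def python_branch_metric(source: str) -> dict:
    """Placeholder tool: rough count of branching statements in Python code."""
    keywords = ("if ", "elif ", "for ", "while ")
    count = sum(1 for line in source.splitlines()
                if line.strip().startswith(keywords))
    return {"branches": count}


# Hypothetical registry: language name -> evaluation tools for that language.
EVALUATORS = {
    "python": [line_count_metric, python_branch_metric],
    "ruby": [line_count_metric],
}


def evaluate(language: str, source: str) -> dict:
    """Run every tool registered for `language` and merge their reports."""
    tools = EVALUATORS.get(language)
    if tools is None:
        raise ValueError(f"no evaluation tool registered for {language!r}")
    report = {}
    for tool in tools:
        report.update(tool(source))
    return report
```

Supporting a new language in this sketch only requires adding an entry to the registry, which matches the idea of one tool set per supported language.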
[0034] The scoring and cheating detection module 525 utilizes the
evaluation of the code and also information from the commit log for
the file. To identify the author of the code of a particular file,
the commit log is evaluated for the repository. The commit log is a
list of who did what for the repository. This permits an evaluation
of the developer's specific contribution(s) to that project. For
example, by analyzing the commit log an evaluation can be made of
the time(s) when the developer made a contribution.
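Extracting one developer's contributions from a commit log can be sketched as below. The `Commit` record and its fields (`author`, `day`, `language`, `lines_added`) are a hypothetical simplification; real commit logs carry more detail and would need parsing from the version control system's own format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Commit:
    # Hypothetical, simplified commit-log entry.
    author: str
    day: date
    language: str
    lines_added: int


def contributions_by_language(commits, author: str) -> dict:
    """Group one author's commits by language as sorted (day, lines_added) pairs."""
    grouped = defaultdict(list)
    for commit in commits:
        if commit.author == author:
            grouped[commit.language].append((commit.day, commit.lines_added))
    for entries in grouped.values():
        entries.sort()  # chronological order
    return dict(grouped)
```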
[0035] The process is continued for all of the developer's
repositories to permit a determination to be made of what languages
that a person has experience in and how much experience the person
has in each language. The recruiting service downloads all of the
developer's repositories and evaluates their contributions to
determine the languages they've written code in. To evaluate
experience the commit log is examined to look at the date and time
of when the developer contributed to the project. Different factors
can be used to determine actual experience. One factor is that
contributions can be evaluated by their frequency and regularity to
weight the actual number of years of experience in a particular
language.
[0036] As an illustrative example, consider a developer named Joe.
If Joe started contributing to a project 3 years ago, and the
commit log shows that he's been contributing regularly to it,
that's an indication that he has 3 years of actual experience.
Thus, a weighting function can take into account the frequency and
regularity of Joe's contributions. For example, if Joe has been
making four or more contributions per year to a Java project, that
is an indication that Joe has been regularly working in Java.
[0037] However, if Joe made a single contribution 3 years ago and
just contributed again for the first time 2 months ago, then the
commit log indicates "episodic" contributions with a wide spacing
between contributions. For this second case, the weighting factor
can be used to reduce Joe's number of years of experience such that
he does not get 3 years of experience credit. The exact weighting
function chosen can be empirically determined based on common
behavior patterns of software developers. For example, if the
commit log shows Joe made a single contribution to a Java project 3
years ago and made a second smaller contribution a month ago there
could be a possibility that Joe is either 1) trying to "inflate"
his resume about the number of years of experience he has in Java;
or 2) may have become aware of the recruiting service and is
intentionally trying to trick the recruiting service. In this
example, the weighting function may also include one or more rules
to discount recent contributions, particularly those of a minor
character, such as a minor code tweak or a contribution made with
many other contributors.
[0038] Thus, while the raw data provides an indication of a maximum
potential number of years of experience, a weighting function may
combine different factors related to frequency of contribution,
size of contribution, and number of co-contributors to arrive at a
more accurate estimate of
the number of years of experience for a developer. The weighting
function may be determined empirically, based on observations about
the way software developers normally work, to optimize different
weighting factors and periodically adjusted to discourage gaming of
the system. Other types of gaming (such as posting the same code at
different times on different sites or plagiarizing code from
others) could also, in theory, be checked as part of a larger fraud
detection function.
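One possible weighting function for the Joe example is sketched below. It is an assumption-laden illustration: the 365-day windows, the proportional credit for sparse windows, and the name `estimate_years_of_experience` are invented here, and only the threshold of four contributions per year comes from the example in the text. Contribution size and number of co-contributors are not modeled.

```python
from datetime import date, timedelta


def estimate_years_of_experience(commit_days, today, commits_per_full_year=4):
    """Estimate years of experience from commit dates.

    The span from the first commit to `today` is cut into 365-day windows.
    A window earns full credit when it holds at least `commits_per_full_year`
    commits and proportional credit otherwise, so sparse ("episodic")
    activity earns only a fraction of the elapsed time.
    """
    if not commit_days:
        return 0.0
    first = min(commit_days)
    total_days = (today - first).days + 1  # include today itself
    offsets = [(day - first).days for day in commit_days]
    years = 0.0
    for start in range(0, total_days, 365):
        length = min(365, total_days - start)
        count = sum(1 for off in offsets if start <= off < start + length)
        activity = min(1.0, count / commits_per_full_year)
        years += (length / 365) * activity
    return years
```

Under these assumptions, a developer who commits every two months for three years is credited with about three years, while a developer with one commit three years ago and one commit two months ago is credited with about half a year.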
[0039] As an illustrative example, patterns in a commit log may be
examined for suspicious factors of how the developer is developing
his/her source code. The simplest example of a developer cheating
is that the developer downloads someone else's source code, opens
their own repository, and submits that same exact code to the new
repository. In that case, there would be a huge update all at once,
and then nothing else in terms of activity. This is inconsistent
with normal commit log behavior in which a user normally makes a
series of regular contributions over time. An honest developer
would normally (except for perhaps extremely small projects) be
consistently committing code they are developing for their project.
As a result when a huge aberrant spike occurs in a commit log a
presumption can be made that there is a high likelihood that
cheating has occurred. In this case, the weighting function can
severely or totally discount the project, i.e., give it extremely
little or no credit.
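The spike heuristic described above can be sketched as a check on how much of a project's code arrived in a single commit. The thresholds `spike_share` and `min_total_lines` and both function names are hypothetical values chosen for illustration; the text says only that a huge one-time update is suspicious and that extremely small projects may be an exception.

```python
def looks_like_bulk_import(commit_sizes, spike_share=0.9, min_total_lines=1000):
    """Flag a project whose code arrived almost entirely in one commit.

    `commit_sizes` lists the lines added by each commit from one developer.
    Projects below `min_total_lines` are exempt, since tiny projects are
    often legitimately written in a single sitting.
    """
    total = sum(commit_sizes)
    if total < min_total_lines:
        return False
    return max(commit_sizes) / total >= spike_share


def project_weight(commit_sizes) -> float:
    """Give a flagged project no credit and any other project full credit."""
    return 0.0 if looks_like_bulk_import(commit_sizes) else 1.0
```

A softer variant could return a small fractional weight instead of zero, corresponding to the "extremely little" credit option.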
[0040] As previously described, in one embodiment there are three
kinds of scores that are calculated for each developer. These
include knowledge, experience and influence. This level of scoring
provides a variety of useful information to evaluate candidates.
However, it will be understood that the recruiting service could
also be implemented with a subset of this set of scores.
[0041] The scores are preferably calculated on a language-specific
basis and an overall basis. Language specific scores are useful to
evaluate skills in a particular language. However, generating an
overall score provides an additional indicator of a developer's
talent.
[0042] An exemplary language-specific scoring process will now be
described. In one implementation, to determine a knowledge score,
an examination is made of lines of code and the number of
repositories that the developer has contributed to. The score is
then calculated by a function that weights the total number of
lines of code in all of the different repositories. That is, a
developer who has written more lines of code has more experience
and credit is given for contributing to different repositories.
However, the number of lines of code can reach very large numbers.
Thus, one way to score knowledge is to apply a logarithm function
based on the number of lines of code. As one example, a knowledge
score for a developer can be generated using a natural log curve:
ln(lines) × number of repositories, where "×" is the
multiplication operation and this equation is a simplified equation
to illustrate a general approach that one of ordinary skill in the
art would further refine for a particular implementation to
optimize empirical results. Other variations based on a logarithm
function are also possible and other factors could be included in
determining a knowledge score.
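The simplified knowledge equation above can be written out as a
short sketch. The function name knowledge_score and the dictionary
input are assumptions made for illustration; only the formula
ln(lines) multiplied by the number of repositories comes from the
description.

```python
import math


def knowledge_score(lines_per_repo):
    """Simplified language-specific knowledge score.

    lines_per_repo maps a repository name to the lines of code the
    developer wrote there in one language (hypothetical input format).
    """
    total_lines = sum(lines_per_repo.values())
    # Count only repositories with an actual contribution.
    repo_count = sum(1 for n in lines_per_repo.values() if n > 0)
    if total_lines <= 0:
        return 0.0
    # The natural log keeps very large line counts from dominating.
    return math.log(total_lines) * repo_count
```

With this form, 1,000,000 lines in a single repository scores lower
than 1,000 lines in each of two repositories, which reflects the
credit given for contributing to different repositories.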
[0043] An exemplary language-specific scoring process for
experience looks at different factors indicative of experience and
then weights the factors. For example, to calculate a total
experience score the individual skill experience scores may be
combined with work experience, i.e., Total Experience=(individual
skill experience scores) × (work experience). Of course, many
variations are possible in terms of weighting individual skill
experience with work experience. In one implementation the process
looks at the lines of code that have been written, with particular
attention to the lines of code written per day, in addition to
commits per day and the number of days of activity. A weighted
function SUM(r) can be used to perform an initial analysis of
experience in different skills. The scoring can be further weighted
by work experience, resulting in a simplified equation to determine
a language-specific experience score: SUM(r) × Work Experience,
where this equation is a simplified equation to illustrate a
general approach that one of ordinary skill in the art would
further optimize for a particular implementation to optimize
empirical results.
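One possible reading of the simplified experience equation is
sketched below. The description does not define the terms of
SUM(r); the sketch assumes that r ranges over the developer's
repositories in one language and that each term combines lines per
day, commits per day and days of activity. The function name, the
field names and the weights w_lines, w_commits and w_days are
hypothetical.

```python
def experience_score(repos, work_experience,
                     w_lines=1.0, w_commits=1.0, w_days=1.0):
    """Simplified language-specific experience score.

    repos: list of dicts with "lines", "commits" and "active_days"
    for one language (hypothetical input format).
    work_experience: multiplier derived from work history (assumed).
    """
    total = 0.0
    for r in repos:
        days = r["active_days"]
        if days <= 0:
            continue  # no recorded activity, nothing to credit
        total += (w_lines * r["lines"] / days       # lines per day
                  + w_commits * r["commits"] / days  # commits per day
                  + w_days * days)                   # days of activity
    # SUM(r) multiplied by work experience, as in the text above.
    return total * work_experience
```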
[0044] An overall score can be calculated by weighting individual
scores. An overall experience score can be determined as
follows:
Experience=(individual skill experience scores) × (work
experience).
[0045] An overall knowledge score can be determined using different
weighting approaches. In one approach a logarithm function is used
to weight the sum of different knowledge skill scores so that a
high score requires the developer to have a wide variety of
skills:
[0046] Knowledge=ln(SUM(Skill Knowledge)), where this equation is a
simplified equation to illustrate a general approach that one of
ordinary skill in the art would further optimize for a particular
implementation to optimize empirical results.
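As a brief sketch of the overall knowledge equation, assuming the
per-language knowledge scores are held in a dictionary (the function
name and input format are hypothetical):

```python
import math


def overall_knowledge(skill_knowledge):
    """Overall knowledge as ln(SUM(Skill Knowledge)).

    skill_knowledge maps a language name to its language-specific
    knowledge score (hypothetical input format).
    """
    total = sum(skill_knowledge.values())
    # ln is undefined at zero; a developer with no scores gets 0.
    return math.log(total) if total > 0 else 0.0
```

Because the logarithm flattens quickly, each further increase in
the overall score requires a much larger summed skill score.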
[0047] The influence score is a measure of the developer's
influence in the larger developer community. In one embodiment the
influence score includes how much the developer's code influences
other developers. Additionally the influence score may include the
developer's influence in social media. For example, the influence
score may include a component based on how a developer's projects
have influenced others, based for example on the number of
followers, forks, and contributors, which may be determined from
data within code hosting repositories. However, an individual
developer may have different influence in different languages,
which has to be taken into account in determining an overall
influence score. Additionally, the developer's influence in social
media may also be considered, such as weighting the influence in
social media by a weighting function. For example, one measure of
influence in social media is a Klout score. Thus an exemplary
overall influence score may be determined as follows:
Influence=f(Klout)+ln(SUM(Skill Influence))
[0048] where a developer who is given credit for influence in
different skills may also be given some credit for social media
influence, and
where this equation is a simplified equation to illustrate a
general approach that one of ordinary skill in the art would
further optimize for a particular implementation to optimize
empirical results.
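The influence equation can be sketched in the same way. The form of
f(Klout) is not specified in the description; the sketch assumes a
simple linear weighting with a hypothetical constant klout_weight.
The function name and input format are likewise assumptions.

```python
import math


def influence_score(klout, skill_influence, klout_weight=0.1):
    """Overall influence as f(Klout) + ln(SUM(Skill Influence)).

    skill_influence maps a language name to a per-language influence
    score, for example one derived from followers, forks and
    contributors (hypothetical input format).
    """
    # f(Klout): assumed here to be a linear weighting.
    social = klout_weight * klout
    total = sum(skill_influence.values())
    code = math.log(total) if total > 0 else 0.0
    return social + code
```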
[0049] The scoring and weighting functions that are applied are
determined empirically to give a desired distribution, for example
by examining what weighting functions give the best real-world
results at a particular point in time for recruiters.
Thus, for example, the actual constants used as weighting factors
and aspects of the weighting functions may be varied based on
feedback on the usefulness of the scoring for simulated or actual
recruiting efforts.
[0050] For example, when calculating knowledge, one approach is to
look for extensive experience in several languages. This is because
in the real world highly knowledgeable developers have a broad
range of experiences to draw upon and are skilled in different
languages. Thus, even if a developer has top 10% scores in one or
two languages, they cannot get a top 10% overall knowledge score,
because that only happens when the developer has top-tier scores in
several languages. That is to say, breadth counts.
[0051] The public code repositories are crawled on a regular basis.
In one embodiment all the raw data obtained from a crawl of the
public code repositories is saved, except for the source code. That
is, it is preferable to save the information obtained by analyzing
the code, in addition to the source log itself. The next time the
crawler encounters that repository, the source code is downloaded
again, and a "refresh" is made based on new contributions.
Additionally, the logs are checked to determine the individual(s)
that made the new contribution. Thus, if Michael has a repository
with a project, the system will also confirm from the log entries
who made any new contributions. Thus if Luca makes a follow-on
contribution to Michael's project, the follow-on contribution will
be credited to Luca. This cross-checking of which individual made
which new contribution to a project is useful to improve accuracy
and reliability of the scoring.
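The cross-checking of contributions against log entries can be
sketched as follows. The log entry fields author_email and
lines_added, the index last_seen_index marking the previous crawl,
and the function name are hypothetical; an actual crawler would
read them from the repository's commit log.

```python
from collections import defaultdict


def credit_new_contributions(log_entries, last_seen_index):
    """Credit lines added since the previous crawl to their authors.

    log_entries: commit log in chronological order; each entry is a
    dict with "author_email" and "lines_added" (hypothetical fields).
    last_seen_index: number of entries already processed earlier.
    """
    credits = defaultdict(int)
    for entry in log_entries[last_seen_index:]:
        # Credit goes to the commit author, not the repository owner.
        credits[entry["author_email"]] += entry["lines_added"]
    return dict(credits)


log = [
    {"author_email": "michael@example.com", "lines_added": 500},
    {"author_email": "michael@example.com", "lines_added": 20},
    {"author_email": "luca@example.com", "lines_added": 75},
]
new_credit = credit_new_contributions(log, last_seen_index=1)
```

Here the first entry was handled by the earlier crawl, so the
refresh credits 20 lines to Michael and 75 lines to Luca even though
the repository belongs to Michael.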
[0052] The social media access module 535 and the social media
aggregation module 540 provide a comprehensive set of social media
links for each profile. The author information obtained from public
repository sites such as GitHub and Stack Overflow may be
incomplete or contain inaccuracies. However typically the author
information will include at least an email address and perhaps also
a name. This information can then be used to obtain additional
social media information using commercial services such as Full
Contact, Inc. of Denver, Colo., Fliptop, Inc. of San Francisco,
Calif. and Rap Leaf of San Francisco, Calif. Many commercial
services match on unique information, such as an email address or a
hash of the email address. (A hash is a fixed-length value computed
from an email address. That way, companies can match users by email
address, but protect their privacy by comparing only hash values.)
In one embodiment a search is performed of all of the sites listed
under Full Contact's set of Social Network Types. From the results,
profile information identifying the names of developers may be
generated along with associated information. For example, work
history may also be scraped from social networking sites.
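Matching by a hash of the email address can be sketched as follows.
The description does not name a hash algorithm or say how any
particular commercial service normalizes addresses; the use of
SHA-256, the lower-casing and trimming step, and the function name
email_hash are assumptions made for illustration.

```python
import hashlib


def email_hash(email):
    """Return a hex digest usable as a lookup key for an email.

    Normalizing first means the same address written with different
    capitalization or stray spaces yields the same key.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two records can be matched without exchanging the raw address.
same_person = email_hash(" Luca@Example.com ") == email_hash("luca@example.com")
```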
[0053] Direct scanning of social media sites, such as LinkedIn and
Google Plus, is also an option. However, a scan usually does not
produce one-to-one results.
For example, if a developer has a common name, such as "John
Smith," a scan based on their name may turn up more than one hit.
To find additional social media links for a particular profile it
is thus desirable to look for multiple matching factors (location,
title, company, name, etc.), and then calculate the probability
that it is a match. If the probability is higher than a certain
threshold, the system automatically merges the profiles. If the
probability is less than that threshold, the system sends a
notification that a manual review is needed.
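The merge-or-review decision can be sketched as follows. The
description does not say how the match probability is calculated;
the sketch substitutes a simple weighted share of agreeing fields,
which is a heuristic score rather than a calibrated probability.
The field weights, the threshold of 0.75 and the function names are
hypothetical.

```python
# Hypothetical relative weights for each matching factor.
FIELD_WEIGHTS = {"name": 4, "location": 2, "title": 2, "company": 2}


def match_score(profile, candidate, weights=FIELD_WEIGHTS):
    """Return the weighted share of fields on which two records agree."""
    matched = 0
    for field, weight in weights.items():
        a, b = profile.get(field), candidate.get(field)
        # Compare case-insensitively; missing fields never match.
        if a and b and a.strip().lower() == b.strip().lower():
            matched += weight
    return matched / sum(weights.values())


def resolve(profile, candidate, threshold=0.75):
    """Merge automatically above the threshold, else flag for review."""
    if match_score(profile, candidate) >= threshold:
        return "merge"
    return "manual_review"
```

Under these assumed weights, a candidate record that agrees only on
the name "John Smith" scores 0.4 and is sent to manual review, while
one that also agrees on location and title scores 0.8 and is merged.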
[0054] Once links to social media are identified for a developer
they can be refreshed at a rate slower than other information in
the public code repositories. Individuals typically add new social
networks infrequently and the URLs of social media sites are
generally static.
[0055] The graphical user interface discussed in this application
includes a set of features that are useful in making recruiting
decisions. However, it will be understood that subsets of these
features may be used. That is, one of ordinary skill in the art
would understand that variations of the graphical user interface
that has been described are possible.
[0056] It will also be understood that the scoring techniques
described are exemplary. As software evaluation tools increase in
their capabilities it will also be understood that other metrics of
coding quality and/or complexity could be utilized.
[0057] While the invention has been described in conjunction with
specific embodiments, it will be understood that it is not intended
to limit the invention to the described embodiments. On the
contrary, it is intended to cover alternatives, modifications, and
equivalents as may be included within the spirit and scope of the
invention as defined by the appended claims. The present invention
may be practiced without some or all of these specific details. In
addition, well known features may not have been described in detail
to avoid unnecessarily obscuring the invention.
[0058] In accordance with the present invention, the components,
process steps, and/or data structures may be implemented using
various types of operating systems, programming languages,
computing platforms, computer programs, and/or general purpose
machines. Methods and graphical user interfaces of the present
invention may also be tangibly embodied as a set of computer
instructions stored on a computer readable medium, such as a memory
device.
* * * * *