Wikidata March 2014 code review

Progress since the 2013 ECR and a look to the future


Created by Jeroen De Dauw for Wikimedia Deutschland
Licensed CC BY-SA 3.0


http://bit.ly/wikidata-cr-march-2014

Some fun Wikibase.git stats

Y U NO LOAD

NCLOC - lines of code

58k 95k

Classes

Class count: 417 602

13% average class NCLOC increase

100 NCLOC → 112 NCLOC

Abstract classes

10% 6%

Methods

Static: 11.3% 8.7%

Public: 74% 68%

NCLOC: 16 17

Cyclomatic complexity/methods: 2.12 2.02

2013 ECR: frequent problems

Static scoping

bit.ly/static-code


Cause: MediaWiki hooks, misplaced code


Trend: Slowly decreasing, topic understood

Lack of lifecycle control

Legacy interfaces:
API, SpecialPage, Action, Job, ContentHandler, Maintenance...


Trend:
Some new introductions, some partial cleanup
Not much change in technical debt


Solution:
Proper DI and disciplined handling of legacy APIs

SRP violations


Trend:
More awareness, though perhaps not enough
New issues still being introduced

Inheritance abuse

Inheritance for code reuse:
API, SpecialPage, Entity, test cases...


Problems:
Complexity, fragility, opacity, immobility, SRP, LSP


Encouraged by lack of lifecycle control

2013 ECR: further issues

Singleton

Initial state:
Few, deprecated


Progress:
No new ones added, some removed
Only a few to go

Namespaces

PSR-0 was suggested

We now use PSR-0 or PSR-4

Except in Wikibase.git and part of Diff

Dependency injection

Libraries use DI (+factories)


Repo and Client factories

  • Abused for non-construction
  • Splits and composition did not occur
  • Excessively bound to, not being fixed consistently


DI approach from ECR implemented in WB Query

Serialization

Suggested changes:
Decoupling from value objects


Progress:
Implemented nicely for Ask
Now migrating remaining code

Concrete PHP

DispatchChanges

ECR

  • Bad function level code (4 levels of nesting)
  • Very high complexity
  • Many non-public responsibilities


Now

  • No real changes
  • Clearly does way to much
  • Own app, should be moved out of Lib

ChangeHandler

ECR

  • High complexity
  • Should be split
  • Singleton
  • Doing stuff in constructor


Now

  • No real changes
  • Does not belong in Lib, move to Repo

Api\EditEntity::modifyEntity

Guess the NPath complexity!
17,336,096 (way over 9000)


Many function level issues

Method no longer critical, class is though

GeoCoordinateParser

ECR: Possible SPR violation, split suggested


Status: done

MapDiffer

ECR: Highest complexity, good tests recommended


Status: complexity reduced via method extraction

ListDiffer

ECR suggested using the strategy pattern


Status: done

Action items

Implement DI → partial

Refactor EditEntity (API) using DI → fail

Refactor WB hooks using DI → fail

Y U NO LOAD

Other progress made

  • Extracted DataModel from Lib
  • Decreased state in Lib
  • Decreased Lib binding to Repo and Client
  • Components
  • Dependency management

More progress

  • More and better tests
  • Better CI
  • Code quality metrics and code coverage

New action items

  • Finish implementing DI
  • Aggressively phase out Lib
  • Create boundaries in Wikibase.git
Y U NO LOAD