## Apples vs. Oranges Formulas

$T$ = type / class
$\mathcal{T}$ = set of all types in dataset $D$
$\mathcal{T} = \{ T \mid \exists (s, \text{rdf:type}, T) \in D \}$

$I$ = instance
$I(T,D)$ = set of instances of type $T$ in dataset $D$
$I(T,D) = \{ s \mid \exists (s, \text{rdf:type}, T) \in D \}$

$P(T)$ = set of distinct properties of the instances of type $T$
$P(T) = \{ p \mid s \in I(T,D) \land \exists (s,p,o) \in D \}$

$OC(p, I(T,D))$ = number of instances in $I(T,D)$ that have property $p$
$OC(p, I(T,D)) = | \{ s \mid s \in I(T,D) \land \exists (s,p,o) \in D \} |$

Coverage
$CV(T,D) = \frac{\sum_{p \in P(T)} OC(p, I(T,D))}{|P(T)| \times |I(T,D)|}$

Weight
$WT(CV(T,D)) = \frac{|P(T)| + |I(T,D)|}{\sum_{T' \in \mathcal{T}} \left( |P(T')| + |I(T',D)| \right)}$

Coherence
$CH(\mathcal{T},D) = \sum_{T \in \mathcal{T}} WT(CV(T,D)) \times CV(T,D)$
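
As a rough illustration of the definitions above, here is a minimal Python sketch on an invented toy dataset. The triples, the function names, and the choice to store $D$ as a list of tuples are all assumptions made for this example. The weight is computed with $|P(T)| + |I(T,D)|$ in the numerator so that the weights sum to 1, and $P(T)$ follows the definition literally, so rdf:type itself counts as a property.

```python
from fractions import Fraction

RDF_TYPE = "rdf:type"

# Toy dataset D as (subject, property, object) triples; the data is invented.
D = [
    ("s1", RDF_TYPE, "Person"), ("s1", "name", "Ann"), ("s1", "email", "ann@example.org"),
    ("s2", RDF_TYPE, "Person"), ("s2", "name", "Bob"),
    ("s3", RDF_TYPE, "City"), ("s3", "name", "Oslo"),
]

def types(D):
    """The set of all types that occur in D."""
    return {o for s, p, o in D if p == RDF_TYPE}

def instances(T, D):
    """I(T, D): subjects declared to be of type T."""
    return {s for s, p, o in D if p == RDF_TYPE and o == T}

def properties(T, D):
    """P(T): distinct properties used by instances of T (rdf:type included)."""
    inst = instances(T, D)
    return {p for s, p, o in D if s in inst}

def occurrences(p, T, D):
    """OC(p, I(T, D)): number of instances of T with at least one p-triple."""
    inst = instances(T, D)
    return len({s for s, q, o in D if s in inst and q == p})

def coverage(T, D):
    """CV(T, D): filled (property, instance) cells over all possible cells."""
    P, I = properties(T, D), instances(T, D)
    return Fraction(sum(occurrences(p, T, D) for p in P), len(P) * len(I))

def weight(T, D):
    """WT(CV(T, D)): share of type T in the sum of |P| + |I| over all types."""
    total = sum(len(properties(U, D)) + len(instances(U, D)) for U in types(D))
    return Fraction(len(properties(T, D)) + len(instances(T, D)), total)

def coherence(D):
    """CH: weighted sum of the coverage of every type."""
    return sum(weight(T, D) * coverage(T, D) for T in types(D))

print(coverage("Person", D))  # 5/6
print(coherence(D))           # 43/48
```

In this toy data, Person has 5 of 6 possible (property, instance) cells filled, City has all of its cells filled, and the two are combined with weights 5/8 and 3/8.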

$D$ = real dataset
$D'$ = new dataset after removing coins
$|D'| < |D|$ and $D' \subset D$

What is a coin?
A coin is a set of triples with the same subject and property; removing a coin removes that whole set.

$\mathcal{T}(s) = \{T_s^1, \ldots, T_s^n\}$ = set of types of subject $s$

$A1 \Longrightarrow$ We do not completely remove property $p$ from any of the types $\{T_s^1, \ldots, T_s^n\}$. That is, after the removal, each type still has instances with property $p$.

$A2 \Longrightarrow$ We do not completely remove instance $s$ from the dataset. This is easily enforced by keeping the triples $(s, \text{rdf:type}, T_s^i)$ in the dataset.

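Under A1 and A2, $|P(T)|$ and $|I(T,D)|$ do not change, so the weight is the same after removing the triples,
but the coverage of each type $T \in \mathcal{T}(s)$ changes:
$CV(T,D') = \frac{\sum_{q \in P(T) \setminus \{p\}} OC(q, I(T,D)) + OC(p, I(T,D)) - 1}{|P(T)| \times |I(T,D)|}$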

$CH(\mathcal{T}, D') = \lambda$ (desired coherence)
$|D'| = \sigma$ (desired size)

$coin(\mathcal{T}(s),p) = CH(\mathcal{T},D) - CH(\mathcal{T},D')$
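
A short derivation, sketched from the definitions above, shows what this difference works out to. It assumes A1 and A2 hold, so $|P(T)|$, $|I(T,D)|$ and therefore the weights are the same in $D$ and $D'$.

For every $T \in \mathcal{T}(s)$, removing the coin lowers $OC(p, I(T,D))$ by exactly 1:
$CV(T,D) - CV(T,D') = \frac{1}{|P(T)| \times |I(T,D)|}$

For every $T \notin \mathcal{T}(s)$, the coverage does not change. Summing over the affected types gives:
$coin(\mathcal{T}(s),p) = \sum_{T \in \mathcal{T}(s)} WT(CV(T,D)) \times \frac{1}{|P(T)| \times |I(T,D)|}$

So the value of a coin depends only on the set of types $\mathcal{T}(s)$ and the property $p$, not on the particular subject $s$.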

$|coin(S,p)|$ = number of subjects that are instances of all the types in $S$ and have at least one triple with property $p$
$|coin(S,p)| = |\{ s \in \bigcap_{T \in S} I(T,D) \mid \exists (s,p,v) \in D \}|$
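
The count above can be sketched in a few lines of Python. The toy triples and the helper names `instances` and `coin_count` are invented for this example, and $S$ is assumed to be non-empty.

```python
RDF_TYPE = "rdf:type"

# Invented toy triples: s2 has two "name" triples, s4 has none.
D = [
    ("s1", RDF_TYPE, "Person"), ("s1", RDF_TYPE, "Student"), ("s1", "name", "Ann"),
    ("s2", RDF_TYPE, "Person"), ("s2", RDF_TYPE, "Student"),
    ("s2", "name", "Bob"), ("s2", "name", "Robert"),
    ("s3", RDF_TYPE, "Person"), ("s3", "name", "Cy"),
    ("s4", RDF_TYPE, "Person"), ("s4", RDF_TYPE, "Student"),
]

def instances(T, D):
    """I(T, D): subjects declared to be of type T."""
    return {s for s, p, o in D if p == RDF_TYPE and o == T}

def coin_count(S, p, D):
    """|coin(S, p)|: subjects in every I(T, D) for T in S with >= 1 p-triple."""
    common = set.intersection(*(instances(T, D) for T in S))
    # A set comprehension counts each subject once, however many p-triples it has.
    return len({s for s, q, o in D if s in common and q == p})

print(coin_count({"Person", "Student"}, "name", D))  # 2
print(coin_count({"Person"}, "name", D))             # 3
```

Subject s2 is counted once even though it has two name triples, because both triples belong to the same coin.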

$C1 \Longrightarrow$ the amount by which we decrease coherence (by removing coins) should be less than or equal to the amount we need to remove to get from $CH(\mathcal{T}, D)$ (the coherence of the original dataset) to $\lambda$ (the desired coherence).

$X(S,p)$ = the integer programming variable representing the number of coins to remove for each coin type $(S,p)$
$\tau$ = the number of types
$\pi$ = the number of properties in the dataset
In the worst case, the number of variables for $D$ can be $2^\tau \pi$

For sets $x$ and $y$, $x \subseteq y$ if all elements of $x$ are also elements of $y$.

$C1 \Longrightarrow \sum_{S \subseteq \mathcal{T},p} coin(S,p) \times X(S,p) \leq CH(\mathcal{T},D) - \lambda$

$M \Longrightarrow$ the amount by which we decrease coherence should be maximized.

$M \Longrightarrow \text{MAXIMIZE} \sum_{S \subseteq \mathcal{T},p} coin(S,p) \times X(S,p)$

$C2 \Longrightarrow \forall S \subseteq \mathcal{T}, p : 0 \leq X(S,p) \leq |coin(S,p)| - 1$

$ct(S,p)$ = average number of triples per coin of type $(S,p)$
$C3 \Longrightarrow (1-\rho) \times (|D| - \sigma) \leq \sum_{S \subseteq \mathcal{T},p} X(S,p) \times ct(S,p)$

$C4 \Longrightarrow \sum_{S \subseteq \mathcal{T},p} X(S,p) \times ct(S,p) \leq (1+\rho) \times (|D| - \sigma)$
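
To see how M and C1 to C4 fit together, here is a small Python sketch that solves a tiny instance by brute-force enumeration. This stands in for a real integer programming solver and only works for a handful of coin types. All numbers are hypothetical, and $\rho$ is treated here as a relative tolerance on the number of removed triples, which is an assumption.

```python
from fractions import Fraction
from itertools import product

# Hypothetical inputs, one list entry per coin type (S, p).
coin_value = [Fraction(1, 20), Fraction(1, 50), Fraction(1, 100)]  # coin(S, p)
coin_count = [4, 6, 10]                                            # |coin(S, p)|
ct = [1, 2, 1]                                                     # ct(S, p)
CH, lam = Fraction(9, 10), Fraction(7, 10)    # current and desired coherence
size_D, sigma, rho = 40, 30, Fraction(1, 10)  # |D|, desired size, tolerance

def solve():
    """Try every assignment X allowed by C2 and keep the best feasible one."""
    budget = CH - lam        # right-hand side of C1
    need = size_D - sigma    # number of triples that should be removed
    best_x, best_drop = None, None
    # C2: each X(S, p) ranges over 0 .. |coin(S, p)| - 1.
    for x in product(*(range(c) for c in coin_count)):
        drop = sum(v * n for v, n in zip(coin_value, x))
        removed = sum(n * t for n, t in zip(x, ct))
        if drop > budget:    # C1: do not fall below the desired coherence
            continue
        if not ((1 - rho) * need <= removed <= (1 + rho) * need):  # C3 and C4
            continue
        if best_drop is None or drop > best_drop:  # M: maximize the decrease
            best_x, best_drop = x, drop
    return best_x, best_drop

x, drop = solve()
print(x, drop)  # (2, 0, 9) 19/100
```

Several assignments reach the same optimum of 19/100 here; the sketch simply returns the first one it meets. For realistic numbers of coin types the search space grows far too quickly for enumeration, which is why the problem is stated as an integer program.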

1. Wow, this is new knowledge for me,
really new…

2. Wow.. only learned people can read those codes.
I can't write apples vs. oranges formulas, but if it's crossing mango and longan with durian, that I can do….

3. My head is spinning … once the talk turns to formulas ..
hehe