SESSION#1

  1. Two types of protocols
    o on-chip : APB, AHB, AXI, ACE, CHI
    o peripheral : SPI, I2C, UART, PCIe, USB, SATA
  2. AXI
    o Chip : RISC
    o ARM : AXI or AHB
  3. Various phases of protocol(~set of rules)
    o arbitration phase (only in AHB)
    o address phase
    o data phase
    o response phase
  4. Analogy
    o Bank :
    o there are different kinds of requirements
    o cash deposit => APB (bytes moving from one component to another)
    o home loan => AXI
    o set of rules?
    o APB
    o psel=1, penable=1 and wait for pready==1 => all should happen at +edge of clock.
  5. OCP
    Open core protocol
  6. What is the maximum frequency of AXI protocol?
    o theoretically infinite.
    o assumption: if combinational logic has no delay, clk-q delay is 0, setup time is 0
    o All these on-chip protocols are limited by the timing closure.
    o STA (static timing analysis): setup time and hold time analysis.
    o Frequency increased => Time period decreases => Meeting setup time becomes difficult
  7. on-chip protocols
    o any number of pins
    peripheral protocols:
    o they can work at very high frequency.
  8. AXI gives more performance?
    o less time, more data transfer.
    o Ex:
    o AXI takes 26 clock cycles to transfer 100 bytes of data (25 beats of 4 bytes each).
    o 1 (addr phase) + 25 (data phases) = 26
    o 1 address phase and 25 data phases happen
    o AXI slaves are more intelligent than AHB slaves (they work out the address of each beat themselves).
    o AHB also takes 26 clock cycles to transfer 100 bytes of data, but it consumes more power.
    o 1 (first addr phase) + 25 (data phases) = 26
    o 25 address phases and 25 data phases happen; AHB is pipelined, so each later address phase overlaps the previous data phase
    o APB takes 50 clock cycles to transfer 100 bytes of data.
    o 100 bytes = 25 transfers
    o each transfer = 2 clocks
    o total = 50 clock cycles
    o what are the advantages of reducing the number of address phases in AXI?
    o Reduction of dynamic power consumption
    o the same address bus can now be used for the address phase of some other transaction
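The cycle counts above can be reproduced with a small sketch. It assumes a 32-bit data bus (4 bytes per beat), no wait states, and it ignores the AXI write response phase, as the notes do. The function names are made up for illustration.

```python
def beats_needed(total_bytes, bytes_per_beat=4):
    """Number of data beats, rounded up (assumes a 32-bit bus by default)."""
    return -(-total_bytes // bytes_per_beat)


def axi_cost(total_bytes, bytes_per_beat=4):
    """AXI burst: one address phase, then one data phase per beat."""
    beats = beats_needed(total_bytes, bytes_per_beat)
    return {"address_phases": 1, "cycles": 1 + beats}


def ahb_cost(total_bytes, bytes_per_beat=4):
    """AHB: one address phase per beat, pipelined with the previous data phase."""
    beats = beats_needed(total_bytes, bytes_per_beat)
    return {"address_phases": beats, "cycles": 1 + beats}


def apb_cost(total_bytes, bytes_per_beat=4):
    """APB: every transfer takes 2 clocks (setup + access), no pipelining."""
    beats = beats_needed(total_bytes, bytes_per_beat)
    return {"address_phases": beats, "cycles": 2 * beats}
```

For 100 bytes this gives 26 cycles for AXI and AHB and 50 for APB. AXI and AHB differ only in the number of address phases (1 versus 25), which is where the dynamic power saving comes from.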
  9. ID is used to identify the transactions
    o Home loan
    o submitting application ==> address phase in AXI
    o submitting documents ==> Multiple data phases in AXI
    o sanction ==> response phase in AXI
    o every home loan has an application number.
    o benefits?
    – who is taking
    – when any document is given or received, it happens with application number
    – when loan sanction happens, it is on application number
    o due to this application number, bank will be able to handle many home loans at same time.
    o If bank uses 3 digit application number => 999 loans can be handled concurrently.
    o AXI uses 4 bit ID = 16 concurrent transactions are possible in AXI.
    o AHB doesn’t use ID = 1 transaction possible at any time
    o APB doesn’t use ID = 1 transaction possible at any time
  10. How AXI based system architecture looks like?

11.
Why have we limited the AXI ID to 4 bits? Can we increase it?
– it can be increased.
How do we decide when an ID is required and when it is not?
– If only one tx needs to be in progress at any given time => ID is not required
– If multiple tx need to be in progress at any given time => ID is required

  1. AXI
    o write transactions = 1 write address phase + Multiple Write data phases + 1 write response phase
    o 3 channels
    o write address channel (M->S) aw
    o write data channel (M->S) w
    o write response channel(S->M) b
    o read transactions = 1 read address phase + Multiple read data phases
    o 2 channels
    o read address channel (M->S) ar
    o read data channel (S->M) r
    o All these channels have few common signals.
    o Handshaking signals, ID
    o valid, ready
    o write address channel common signals
    o awvalid, awready, awid
    o write data channel common signals
    o wvalid, wready, wid
    o write response channel common signals
    o bvalid, bready, bid
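A minimal sketch of the valid/ready handshake shared by all five channels: a transfer happens on a rising clock edge only when valid and ready are both 1. The per-cycle lists are a made-up way to represent what is sampled at each edge.

```python
def handshake_cycles(valid, ready):
    """Return the clock-edge indexes at which a transfer completes.

    valid, ready: sequences of 0/1, one entry per rising clock edge.
    """
    return [cycle for cycle, (v, r) in enumerate(zip(valid, ready)) if v and r]
```

For example, with valid = [1, 1, 1, 0, 1] and ready = [0, 1, 1, 1, 1], transfers complete at edges 1, 2 and 4: at edge 0 the source waits for ready, and at edge 3 there is nothing valid to transfer.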
  2. burst transfer, atomic transfer
    o atomic transfer
    o 1 address phase + 1 data phase
  3. address and control information
    o common signals: awvalid, awready, awid
    o address information => master uses awaddr signal
    o what all additional details master wants to convey to the slave?
    o type of transfer? awburst
    o incrementing, fixed, wrapping address.
    o how many beats will be there? (Burst length)
    o awlen
    o what is the ID?
    o awid
    o how many bytes are transferred per beat? (burst_size)
    o awsize
    o lock
    o awlock, awcache
    o write address channel signals
    o awvalid, awready, awid
    o awaddr, awburst, awlen, awsize, awlock, awprot, awcache
  4. write data channel
    awsize = 1 => 2**awsize = 2**1 = 2 bytes per beat
    o wvalid, wready, wid
    o wdata, wstrb, wlast
    o traffic example
    o wdata size = 64 bits
    o wstrb = 8 bits (one strobe bit per byte lane of wdata)
    wstrb = 8’b1100_0000
    wdata[63:56] & wdata[55:48] are valid data.
    wstrb = 8’b0000_0011
    wdata[7:0] & wdata[15:8] are valid data.
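The wstrb examples above can be checked with a short sketch: bit n of wstrb marks wdata[8n+7:8n] as valid. The function name is made up for illustration.

```python
def valid_byte_lanes(wstrb, bus_bytes=8):
    """Return (lane, msb, lsb) for every byte lane whose strobe bit is set."""
    return [
        (lane, 8 * lane + 7, 8 * lane)
        for lane in range(bus_bytes)
        if (wstrb >> lane) & 1
    ]
```

valid_byte_lanes(0b1100_0000) reports lanes 6 and 7, i.e. wdata[55:48] and wdata[63:56]; valid_byte_lanes(0b0000_0011) reports lanes 0 and 1, i.e. wdata[7:0] and wdata[15:8].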
  5. write response channel
    o bvalid, bready, bid
    o bresp
  6. read address channel
    o arvalid, arready, arid
    o araddr, arlen, arsize, arburst, arcache, arlock, arprot
  7. read data channel
    o rvalid, rready, rid
    o rdata, rresp, rlast
    o Read channels don’t use a strobe; there is an automatic mechanism through which the master knows the valid positions of data.

SESSION#2

revision:

  1. AXI
    • on-chip protocol
    • supports most of the features
    • AXI is implemented using 5 different channels
      o all the signals in these channels
    • How AXI gives better performance
    • AXI signals => their significance
    • importance of ID signal

Notes:

  1. AXI Write Transaction example:
    awaddr = 32’h1000_0000
    awlen = 5
    burst length (how many beats) = 5+1 = 6
    awburst = INCR
    o data writes happen to incrementing address locations.
    awsize = 2
    o Number of bytes per beat = 2**2 = 4

total bytes transferred in this Transaction = 6*4 = 24
Since this is an ‘INCR’ type of burst
o all the bytes are stored into incrementing locations
o starting address = 32’h1000_0000
o last address(where 24th byte is stored to) = 32’h1000_0017 (‘h17 = 23 in decimal)

1st beat happens to:
addr = 32’h1000_0000
wdata = 32’h12345678
Assuming Little endian type of architecture.
78 stored in to 32’h1000_0000
56 stored in to 32’h1000_0001
34 stored in to 32’h1000_0002
12 stored in to 32’h1000_0003

2nd beat happens to: addr = 32’h1000_0004
wdata = 32’hFFEEDDCC
Assuming Little endian type of architecture.
CC stored in to 32’h1000_0004
DD stored in to 32’h1000_0005
EE stored in to 32’h1000_0006
FF stored in to 32’h1000_0007

3rd beat happens to: addr = 32’h1000_0008
wdata = 32’h10203040
Assuming Little endian type of architecture.
10 stored in to 32’h1000_000B
20 stored in to 32’h1000_000A
30 stored in to 32’h1000_0009
40 stored in to 32’h1000_0008

4th beat happens to: addr = 32’h1000_000C
wdata = 32’h11223344
Assuming Little endian type of architecture.
44 stored in to 32’h1000_000C
33 stored in to 32’h1000_000D
22 stored in to 32’h1000_000E
11 stored in to 32’h1000_000F

5th beat happens to: addr = 32’h1000_0010
wdata = 32’h11223344
Assuming Little endian type of architecture.
44 stored in to 32’h1000_0010
33 stored in to 32’h1000_0011
22 stored in to 32’h1000_0012
11 stored in to 32’h1000_0013

6th beat happens to: addr = 32’h1000_0014
wdata = 32’h11223344
Assuming Little endian type of architecture.
44 stored in to 32’h1000_0014
33 stored in to 32’h1000_0015
22 stored in to 32’h1000_0016
11 stored in to 32’h1000_0017
will there be 7th beat? No
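The INCR write walk-through above can be sketched as follows. This assumes an aligned start address, a data bus exactly as wide as one beat (32 bits here), and little-endian byte order; the function name and the dictionary used as memory are illustrative only.

```python
def incr_write(awaddr, awlen, awsize, wdata_beats):
    """Return {byte_address: byte_value} for an aligned INCR write burst."""
    bytes_per_beat = 1 << awsize          # 2**awsize
    beats = awlen + 1                     # burst length
    if len(wdata_beats) != beats:
        raise ValueError("need exactly awlen + 1 data beats")
    memory = {}
    for beat, word in enumerate(wdata_beats):
        beat_addr = awaddr + beat * bytes_per_beat
        # Little endian: the lowest byte of the word goes to the lowest address.
        for offset, byte in enumerate(word.to_bytes(bytes_per_beat, "little")):
            memory[beat_addr + offset] = byte
    return memory
```

Calling it with awaddr = 0x1000_0000, awlen = 5, awsize = 2 and the six wdata values from the example fills 24 byte locations, the last one being 0x1000_0017, so there is no 7th beat.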

2.
AXI Read Transaction example:
Araddr = 32’h1000_F000
Arlen = 4
beats = 5
Arburst = Fixed
Arsize = 2
burst size = 4

tx size = 20 (~’h14)
starting_addr = 32’h1000_F000

Fixed type of burst
o all the reads will happen to 32’h1000_F000 address only
o address incrementing won’t happen.
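A small sketch of the FIXED read above, assuming the slave behaves like a FIFO that hands out its next entry each time the same address is read. The FIFO contents and the function name are made up.

```python
from collections import deque


def fixed_burst_read(fifo, araddr, arlen):
    """Read arlen + 1 beats, all from the same address, popping the FIFO."""
    beats = arlen + 1
    if len(fifo) < beats:
        raise ValueError("FIFO does not hold enough entries for this burst")
    # The address never changes; only the data returned by the slave does.
    return [(araddr, fifo.popleft()) for _ in range(beats)]
```

With araddr = 0x1000_F000 and arlen = 4, the result is 5 (address, data) pairs that all carry address 0x1000_F000.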

  1. If the slave is FIFO, we go for FIXED burst type (since I need to read from same location)
  2. If the slave is memory, we go for INCR burst type (since I need to read different contiguous addresses)
  3. In total, AXI supports 3 types of bursts
    o Fixed, Incr, Wrap
  4. Wrap type of burst

32’h100 = 256 decimal
tx size = 24
remainder = 256%24 = 16
Lower boundary = 256 – 16 = 240 = 32’hF0
Upper boundary = 32’hF0 + ‘h17 = 32’h107
32’h100 falls in to 32’hF0 to 32’h107 range.
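The boundary arithmetic above can be written as a tiny helper. Note that the AXI specification only allows WRAP bursts of 2, 4, 8 or 16 beats with an aligned start address, so a 24-byte transaction is best read as an illustration of the arithmetic rather than as a legal burst. The function name is made up.

```python
def wrap_boundaries(addr, tx_size):
    """Return (lower, upper) wrap boundary using the modulo method."""
    remainder = addr % tx_size
    lower = addr - remainder
    upper = lower + tx_size - 1
    return lower, upper
```

wrap_boundaries(0x100, 24) gives (0xF0, 0x107), matching the hand calculation.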

  1. 32’h1000_0010 which boundary it falls into
  2. While working with Cache memories, Wrap kind of transfers are used.

9
AXI Read Transaction example:
araddr = 32’h1000_0010
arlen = 7
arburst = Wrap
arsize = 3 (per beat 8 bytes)

tx_size = 64
boundary =
decimal = 268435472%64 = 16 (0.25 maps to 16)
wrap lower addr = 32’h1000_0010 – ‘h10 = 32’h1000_0000
wrap upper addr = 32’h1000_0000 + 63 = 32’h1000_003F

1st beat happens to 32’h1000_0010
2nd beat happens to 32’h1000_0018
3rd beat happens to 32’h1000_0020
4th beat happens to 32’h1000_0028
5th beat happens to 32’h1000_0030
6th beat happens to 32’h1000_0038 => 3F => here we reach upper boundary
7th beat happens to 32’h1000_0000
8th beat happens to 32’h1000_0008 => 0F

10
AXI Read Transaction example:
araddr = 32’h1000_0010
arlen = 7
arburst = Incr
arsize = 3 (per beat 8 bytes)

tx_size = 64
boundary =
decimal = 268435472%64 = 16 (0.25 maps to 16)
wrap lower addr = 32’h1000_0010 – ‘h10 = 32’h1000_0000
wrap upper addr = 32’h1000_0000 + 63 = 32’h1000_003F

1st beat happens to 32’h1000_0010
2nd beat happens to 32’h1000_0018
3rd beat happens to 32’h1000_0020
4th beat happens to 32’h1000_0028
5th beat happens to 32’h1000_0030
6th beat happens to 32’h1000_0038
7th beat happens to 32’h1000_0040 => INCR differs
8th beat happens to 32’h1000_0048 => INCR differs
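The two read examples above (same araddr, arlen and arsize, only the burst type differs) can be generated by one sketch. It assumes an aligned start address; the burst type is passed as a plain string, which is an illustrative choice and not the AxBURST encoding.

```python
def beat_addresses(axaddr, axlen, axsize, burst):
    """Return the address of every beat for a FIXED, INCR or WRAP burst."""
    if burst not in ("FIXED", "INCR", "WRAP"):
        raise ValueError("burst must be FIXED, INCR or WRAP")
    bytes_per_beat = 1 << axsize
    beats = axlen + 1
    tx_size = beats * bytes_per_beat
    lower = axaddr - (axaddr % tx_size)   # wrap lower boundary
    addresses = []
    current = axaddr
    for _ in range(beats):
        addresses.append(current)
        if burst == "FIXED":
            continue                      # address never changes
        current += bytes_per_beat
        if burst == "WRAP" and current >= lower + tx_size:
            current = lower               # crossed the upper boundary: wrap
    return addresses
```

For araddr = 0x1000_0010, arlen = 7, arsize = 3, the first six addresses are the same for WRAP and INCR; the 7th and 8th are 0x1000_0000 and 0x1000_0008 for WRAP but 0x1000_0040 and 0x1000_0048 for INCR.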

  1. narrow transfer
    1st beat => 100, 2nd beat => 102
    3rd => 104
    locations = 104, 105
    wdata = 64’h10203040_50607080
    ‘h104%8 = 4 (byte lane 4 onwards is considered)
    @104 => 40
    @105 => 30

4th beat => 106
locations = 106, 107
wdata = 64’h10203040_50607080
‘h106%8 = 6
@106 => 20
@107 => 10

5th beat => 108
locations = 108, 109
wdata = 64’h10203040_50607080
‘h108%8 = 0
@108 => 80
@109 => 70
—- it will stop here—-

11.
addr = 32’h100
1st beat
wdata = 64’h10203040_50607080
locations = 100, 101, 102, 103
@100 => 80
@101 => 70
@102 => 60
@103 => 50

2nd beat @ 32’h104
wdata = 64’h10203040_50607080
locations = 104, 105, 106, 107
‘h104%8 = 4 (byte lanes 4 to 7 are used)
@104 => 40
@105 => 30
@106 => 20
@107 => 10

3rd beat @ 32’h108
wdata = 64’h10203040_50607080
locations = 108, 109, 10A, 10B
‘h108%8 = 0 (byte lanes 0 to 3 are used)
@108 => 80
@109 => 70
@10A => 60
@10B => 50
so on
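Both narrow-transfer walk-throughs above (2 bytes per beat and 4 bytes per beat on a 64-bit bus) follow the same rule: the starting byte lane of a beat is addr % 8. A minimal sketch, assuming aligned beat addresses and the same wdata value driven on every beat, as in the notes; the function name is made up.

```python
def narrow_write(awaddr, nbeats, awsize, wdata, bus_bytes=8):
    """Return one {byte_address: byte_value} dict per beat of a narrow INCR write."""
    bytes_per_beat = 1 << awsize
    lanes = wdata.to_bytes(bus_bytes, "little")   # lanes[0] is wdata[7:0]
    per_beat = []
    for beat in range(nbeats):
        addr = awaddr + beat * bytes_per_beat
        first_lane = addr % bus_bytes             # e.g. 0x104 % 8 = 4
        stored = {addr + i: lanes[first_lane + i] for i in range(bytes_per_beat)}
        per_beat.append(stored)
    return per_beat
```

With wdata = 0x10203040_50607080 and awsize = 2, the beat at 0x104 stores 40, 30, 20, 10 into 0x104 to 0x107; with awsize = 1, the 4th beat stores 20 and 10 into 0x106 and 0x107.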

  1. HOMEWORK
    awaddr = 32’h100
    awlen = 4, awburst = Wrap(NOTE THIS)
    awsize = 2
    Wdata bus width is 64 bits.
    Wrap boundary : 240 to 259 (32’hF0 to 32’h103)

13.
awaddr = 32’h100
awlen = 4, awburst = incr
awsize = 2
Wdata bus width is 64 bits.

SESSION#3

  1. how to analyze AXI tx
    • wrap
    • incr
    • fixed
  1. real time example of narrow transfer
    processor <—–> memory
    data bus size = 64 bits
    but processor can only process 8 bit at a time.
    o it will read 1 byte at a time => narrow transfer.
  2. narrow transfers don’t have any benefit
    o they are just an option where the master can read only the number of bytes it needs.

Notes:

  1. aligned and unaligned transfers
    o if addr is divisible by burst_size (bytes_per_beat), then it is an aligned transfer, else unaligned
  2. Little endian architecture
    o lower bytes written to lower address
    o upper bytes written to upper address
  3. Big endian architecture
    o lower bytes written to upper address
    o upper bytes written to lower address
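The two byte orders can be compared with a short sketch that places a word into byte addresses. The function name and the dictionary used as memory are illustrative.

```python
def store_word(value, addr, nbytes=4, endian="little"):
    """Return {byte_address: byte_value} for a word stored at addr.

    endian is "little" or "big", as accepted by int.to_bytes.
    """
    return {addr + i: b for i, b in enumerate(value.to_bytes(nbytes, endian))}
```

Storing 0x12345678 at 0x1000_0000 puts 78 at the lowest address in little endian, and 12 at the lowest address in big endian.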
  4. who gives response?
    o Slave
    o Slave is indicating its status.
    o AXI supports 4 types of responses
    o OKAY
    o if master did normal access, response is OKAY => it indicates the success
    o if master did exclusive access, response is OKAY => it indicates the failure of the access.
    o EXOKAY
    o SLVERR
    o DECERR
  5. exclusive access
    o slave is exclusively reserved for a master(for particular period of time), not for whole application
    o during this time if some other master accesses the same slave, then exclusivity is lost.
    o to indicate that exclusivity is lost, the slave provides an OKAY response to the 1st master
    o the 1st master now knows that the data it received may not be the correct one (since someone else accessed it)
    o if no other master changed it, exclusivity is maintained
    o slave will give EXOKAY to the master#1.
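A heavily simplified sketch of the OKAY / EXOKAY behaviour described above. It tracks one reserved address per master and treats a write by another master to that address as the event that breaks exclusivity; real slaves track more than this, and the class and method names are invented for illustration.

```python
class ExclusiveMonitor:
    """Toy model of a slave-side exclusive access monitor."""

    def __init__(self):
        self._reserved = {}  # master name -> reserved address

    def _invalidate_others(self, master, addr):
        # Any other master holding a reservation on addr loses it.
        losers = [m for m, a in self._reserved.items() if a == addr and m != master]
        for other in losers:
            del self._reserved[other]

    def exclusive_read(self, master, addr):
        self._reserved[master] = addr
        return "EXOKAY"

    def normal_write(self, master, addr):
        self._invalidate_others(master, addr)
        return "OKAY"

    def exclusive_write(self, master, addr):
        if self._reserved.get(master) != addr:
            return "OKAY"    # exclusivity was lost: the exclusive access failed
        del self._reserved[master]
        self._invalidate_others(master, addr)
        return "EXOKAY"      # nobody interfered: the exclusive access succeeded
```

If master M1 does an exclusive read and then an exclusive write with nothing in between, it gets EXOKAY. If M2 writes the same address in between, M1's exclusive write gets OKAY instead.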
  6. decode error
    analogy:
    998897 (pin code doesn’t exist)
    – If I send a post to this pin code, what response do I get? who will respond?
    o post office will respond saying ‘pin’ doesn’t exist ===> Decode error (I am not able to decode the addr)
    AXI:
    – AXI interconnect responds saying ‘addr’ doesn’t exist in the address mapping.
    addr = 32’h1300
    o AXI interconnect doesn’t have this address mapped, then it gives ‘decode error’
    address mapping:
    o pin code mapping
    o KA: 500000 to 560000
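The interconnect's decode step can be sketched as a lookup in an address map. The map below is hypothetical, chosen only so that 0x1300 is unmapped as in the example above; the slave names are made up.

```python
# Hypothetical address map: (first address, last address, slave name).
ADDRESS_MAP = [
    (0x0000, 0x0FFF, "slave0"),
    (0x1000, 0x12FF, "slave1"),
]


def route(addr, address_map=ADDRESS_MAP):
    """Return the slave that owns addr, or "DECERR" if no slave is mapped there."""
    for first, last, slave in address_map:
        if first <= addr <= last:
            return slave
    # No slave owns this address, so the interconnect itself responds.
    return "DECERR"
```

route(0x1300) returns "DECERR", just as the post office reports that the pin code does not exist; route(0x1000) is forwarded to slave1, which then gives its own response.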
  7. AXI VIP coding
    o what is VIP? why is it important?
    o VIP : Verification Intellectual Property
    o why VIP is important?
    o VLSI design flow
    o one major factor: Time2Market
    o How quickly company can come out with the product.
    o If the iPhone 13 came out in 2025 => would it sell the same number of phones? No
    o how to reduce time2Market?
    o reduce the time taken to do each step
    o one of the important steps which takes a lot of time: Functional verification
    o How to reduce this time taken by Functional verification?
    o Reduce the time taken to setup the TB
    o instead of taking 1 month time to setup TB, do it in 1 week time
    o how to achieve this?
    o Don’t develop everything from scratch.
    o License (purchase) already developed IPs for TB development
    o since these IPs are used for verification purposes, they are called verification IP
    o If an IP is meant for design purposes, then it is called ‘design IP’
    o By using VIPs we can reduce the time taken to develop the TB.
    o hence more time can be spent on developing the testcases => essentially improves the quality of verification.
  8. Mobile SOC verification project
    o it may use 30 VIP’s
    o they won’t code anything from scratch.
  9. How VIPs are used in TB development?
    o I will approach Synopsys/Cadence/Mentor to purchase(license) of these VIPs
    o I will get AXI VIP license, AHB VIP license, USB VIP license, PCIe VIP license
    o why not APB VIP license?
    – since it is a simple protocol, team will develop this VIP.
    o AXI VIP will come with some configuration options
    o just set those options => it can be updated specific to our requirements.
    o AXI protocol with 64 bit data bus => AXI VIP can be configured to work as a 64 bit bus VIP.
  10. If the IP is developed using SV, it is called VIP
    if developed using UVM, it is called UVC (Universal Verification Component)

SESSION#4

  1. We implemented AXI VIP template environment
    o why is it called Template ?

Agenda:

  1. Development of VIP components
  2. current status
  1. next step
  1. 1st stage: complete AXI VIP template coding (skeletal structure)
    2nd stage: generate tx, get it in BFM, print various phase messages
  2. next step
    current code has two issues
    • axi_slave is empty
    • axi_slave port connections are missing
  3. AXI slave can be implemented as
    • module
    • class
  4. to record a function (macro) in the editor
    escape mode
    qa => start recording into @a
    do one complete iteration
    come exactly to the point in the next line where we need to repeat the same steps
    q => stop recording
    :map 3 @a
    mapping 3 to @a
    to do same function 20 times => 203 (20 followed by the mapped key 3)
    to do same function 40 times => 403 (40 followed by the mapped key 3)

8.
when did I start training?
– Nov/DEC => 5 months
– mon, cov?
– scoreboard?
– clocking block?
– modport?

what more to do?

SESSION#5

  1. why rready is not getting asserted(not becoming 1) for the 2nd beat

what is pending?

  1. Do write – read to the same address.
    o compare the data
  2. Implement the monitor and coverage

notes:

  1. write tx
    addr = 16’hBFF3
    AWLEN = 6
    AWSIZE = 1
  2. Issues in current environment
    o read data is coming as all 0’s ==> Fixed (mistake: using arlen_t instead of araddr_t)
    o write data is happening 1 more time.
    o 1st beat read data is happening for 3 clocks, hence it is making the last data phase not happen.
    o Fix the handshaking issue
  3. How to fix handshaking issues between BFM and DUT?
    o use clocking block
    o user can choose proper input and output skew values and make sure signal driving and sampling happen properly.
  4. how did we fix the issue?
    o by using input skew #0, we are able to make sure wready and rready are sampled by the BFM at the same edge, instead of being delayed by one clock cycle. Hence we are able to resolve the sampling and driving issues.
  5. OUR TB is stable. => I have complete control on my TB
    o We can implement any testcases in 2 -3 minutes